Factorization underpins our ability to make predictions at the LHC, both in Monte Carlo
simulations and direct calculations. An improved theoretical understanding of jet substructure can
lead to calculations that can confront data and validate the Monte Carlo description of jets. We
derive constraints on jet substructure algorithms from factorization, focusing on the broad class of
jet observables where the soft and collinear dynamics of QCD dominate. A necessary condition
for factorization is that the phase space constraints on soft and collinear dynamics for a given
observable are independent of each other. This condition allows us to use a simple power counting
analysis to place strong constraints on the form of jet substructure methods that can factorize.
We illustrate this approach by considering four substructure algorithms: the mass-drop filter Higgs
tagger, pruning, trimming, and N-subjettiness. We find that factorization constrains the scaling
of parameters in all of the substructure algorithms, and that the generic declustering and filtering
techniques do not factorize. However, we determine simple modifications that can be made to allow
for factorization.
I. INTRODUCTION

Jets play a central role in the search for new physics at 
hadron colliders. Jet algorithms allow us to map compli- 
cated hadronic final states onto the underlying hard in- 
teraction and therefore are important in both finding new 
physics signals and understanding the dominant back- 
grounds from QCD. 

At the new energy frontier of the LHC, heavy particles
such as electroweak bosons are often produced with
sufficient boost that their hadronic decay products are
reconstructed into a single "fat jet" using traditional jet
algorithms. This makes their identification challenging.
There has been much recent theoretical progress in
developing jet substructure techniques, which have two goals.
First, they use the high $p_T$ substructure in the jets to
improve the discrimination between jets from the decay
of heavy boosted objects and large $p_T$ "QCD jets" from
the QCD evolution of a colored parton. Second, they
aim to remove the contamination of soft radiation
from underlying event and pileup, which is important to
improve the resolution of measurements for large jets.

Jet substructure tools can extend the reach of a variety
of SM analyses and provide new search strategies
for BSM physics. Substructure algorithms have been
developed for specific channels, such as top-tagging [1-10],
W-tagging [TJ QJjlJTT], Higgs + W/Z production,
resurrecting the $h \to b\bar{b}$ search channel that would otherwise
be swamped by the W/Z + jets background [12-14], and
new physics channels [15-26]. Substructure algorithms
have also been developed for generic boosted topologies,
which are referred to as jet grooming [27-29]. These
methods have been shown to be broadly effective at
tagging boosted decays and removing contamination from
underlying event and pileup. Jet shapes, which measure
a function of the momenta within a jet without placing
strong exclusive cuts on the phase space, have recently
been shown to be effective in tagging boosted tops [H 0-9].
By avoiding the rather complex and exclusive phase
space cuts "traditionally" imposed by jet substructure
techniques, jet shapes are generally simpler and more
amenable to direct calculation.

Many jet substructure algorithms are beginning to be 
verified on data [50"rf52"] and as with other jet tools, it 
is important to understand them from a theoretical per- 
spective. Currently we rely on Monte Carlo (MC) predic- 
tions for jet substructure, which are relatively untested 
and may be susceptible to large, poorly quantified un- 
certainties. The MC lacks perturbative accuracy in de- 
scribing the evolution of the jet, which is compensated 
by good modeling of hadronization and non-perturbative 
effects. Calculations of jet substructure can be compared 
to data and help validate and improve the MC descrip- 
tion of jet substructure. Furthermore, the plethora of 
available jet substructure techniques poses a challenge in 
exploring them experimentally. The Boost 2010 Monte 
Carlo study [33] suggests that many methods have sim- 
ilar quantitative performance, meaning it is important 
to identify any characteristics of substructure algorithms 
that make them theoretically attractive. 

Our ability to make reliable theoretical predictions at
the LHC relies on factorization, which is the separation
of the cross section into pieces that depend on separate
energy scales. There exists a hierarchy of perturbative
scales in most jet physics applications, i.e., the jet $p_T$,
of order $\sqrt{\hat{s}}$, and the invariant mass of the jet, $m_J$,
which is typically much smaller. In this case the soft and collinear
limits of QCD provide the correct description of jet evolution,
where the collinear dynamics generates high energy
radiation collimated with the jet direction and the soft
dynamics governs much lower energy global radiation in
and between jets. This gives rise to large logarithms in a
fixed order perturbative description of a jet that must be
resummed to control the perturbative series and regain
accuracy. Factorization is necessary not just in separating
perturbative and non-perturbative effects, but also in
disentangling the scales set by the dynamics of soft and
collinear evolution that describe a jet, see Fig. 1. This
is essential for resummation of large logs carried out by
renormalization group evolution.

This work examines a necessary condition for factor- 
ization for jet observables. We focus on soft-collinear 
factorization, the separation of the soft and collinear con- 
tributions to a jet observable, which is a key step in prov- 
ing factorization theorems for a wide range of jet physics 
applications, including most jet algorithms, substructure 
methods, and jet shape observables. A necessary condi- 
tion for soft-collinear factorization is that the phase space 
cuts generated by the observable being measured con- 
strain the collinear dynamics independently of the soft 
radiation, and vice versa to all orders in perturbation 
theory. We refer to this as soft-collinear observable phase 
space (OPS) factorization. 



We find that a straightforward (see footnote 1) power counting
analysis can be used to test whether OPS factorization holds
for a given jet observable by using Soft Collinear Effective
Theory (SCET) [34-37], an effective field theory (EFT)
of QCD at high energies. The EFT framework is very
useful because it provides a systematic power counting
which allows one to quantitatively determine the soft and
collinear contributions to a jet observable and the corrections
to this limit. SCET has been applied to a broad
range of jet physics problems, many relevant for jet
substructure. It has been used to understand $e^+e^-$ event
shapes at high accuracy [38-42], jet algorithms and jet
shapes in multijet events [43-46], and ways to use event
shapes to veto against jet-like events at the LHC [47-49].
The framework of SCET can be used to analyze whether
the phase space cuts generated by the observable that
govern the soft and collinear dynamics factorize. It
allows one to easily determine the key qualitative features
of the algorithm or observable, which we find is informative,
especially for jet substructure. When factorization
fails, this analysis can be used to diagnose problems and
identify modifications that allow for factorization. The
analysis of OPS factorization cannot by itself prove
factorization, but is a key component in the proof.

Jet algorithms, substructure methods, or jet shapes 
for which soft-collinear phase space constraints do not 
separate cannot be factorized. If large logarithms arise 
in a fixed order perturbative series, then although the 
perturbative series is calculable, these large logs cannot 
be resummed and accurate predictions cannot be made. 
This has impact not only on a calculation but on the 
MC as well, as factorization underlies the MC. If factor- 
ization is not respected for the leading logarithms, then 
it endangers the predictability of the MC. 

The outline of the paper is as follows: in Sec. II we
discuss modes and power counting in SCET, followed
by a review of factorization in the effective theory. In
Sec. III we show how we can use power counting of soft
and collinear modes in SCET to test for OPS factorization,
and we present general constraints that apply to jet
substructure methods. We illustrate the power counting
analysis of OPS factorization by applying it to the
$k_T$ class of jet algorithms ($k_T$, Cambridge/Aachen, and
anti-$k_T$) [50-54], as well as the JADE algorithm [551 155],
which is known not to factorize. The power counting
analysis also lets us easily determine characteristic
properties of each jet algorithm. In Sec. IV we apply our
test to four substructure algorithms: the mass-drop filter
Higgs tagger, pruning, and trimming, in addition to
the jet shape N-subjettiness. We find that the general
procedures of declustering and filtering used in the Higgs
tagger do not factorize. We discuss simple modifications
that can restore factorization, and study the impact on
the effectiveness of the Higgs tagger. We find that both
pruning and trimming factorize if the parameters of the
algorithms are chosen to have the appropriate scaling
with the power counting parameter of SCET. We use the
power counting analysis to build a simple picture for the
behavior of pruning, and we find this picture agrees well
with Monte Carlo simulation. We discuss factorization
constraints for generic jet shapes and N-subjettiness in
particular. Finally, in Sec. VI we conclude.

1 By straightforward we mean that it can be performed without
technical expertise in SCET.



II. FACTORIZATION AND POWER COUNTING



In this section we highlight the distinction between 
the factorization of a hard scattering process and soft- 
collinear factorization, which is required to obtain a re- 
liable perturbative description when there exists a hier- 
archy of scales in a jet observable. Factorization requires 
that the phase space constraints generated by the ob- 
servable act on soft and collinear final states indepen- 
dently of each other. We refer to this as OPS factoriza- 
tion. Power counting in SCET makes it straightforward 
to test this necessary condition for soft-collinear factor- 
ization. We review SCET power counting and illustrate 
the steps needed to prove factorization with the example 
of the dijet jet mass distribution. 



A. Soft Collinear Factorization 

Hard scattering processes at hadron colliders with no 
constraints on the final state satisfy a familiar factoriza- 
tion: 



  $\sigma = \sum_{i,j} \int dx_i\, dx_j\, f_i(x_i)\, f_j(x_j)\, \hat{\sigma}_{ij}(x_i, x_j)$ ,   (1)

where $\hat{\sigma}_{ij}$ is the hard scattering cross section that has
been factorized from the parton distribution functions
$f_{i,j}$. This factorization separates the calculable,
perturbative hard scattering at the center-of-mass energy scale,
$\sqrt{\hat{s}}$, from the measurable non-perturbative parton
luminosities at the scale $\Lambda_{\rm QCD}$. Eq. (1) reflects the basic
action of factorization: it divides the cross section into
separately calculable or measurable pieces that each depend
on dynamics at different scales. This basic factorization
theorem is widely applied to make theoretical predictions
for hadron collisions. For example, it is part of the
underlying assumptions used in MC programs to generate
events.

The utility of the hard scattering factorization in
Eq. (1) is limited, since it applies only to fully inclusive
cross sections. In all jet physics applications the cross
section is made more exclusive through final state cuts
and the factorization theorem becomes more complex, if
it holds at all. It is important to identify which
applications fail factorization, since such a failure greatly
compromises our ability to make reliable theoretical predictions
for these observables and therefore limits their utility.

For the jet mass, there is a natural hierarchy of scales
in a jet between the jet $p_T$, the invariant mass of the
jet, $m_J$, and the non-perturbative scale $\Lambda_{\rm QCD}$, where
$p_T \gg m_J \gg \Lambda_{\rm QCD}$. The perturbative scales associated
with the soft and collinear dynamics governing the jet
mass are shown in Fig. 1. This hierarchy of scales can
lead to large Sudakov double logarithms, giving terms
like $\alpha_s^n \ln^{2n}(m_J/p_T)$ in perturbation theory. This can
spoil the convergence of the perturbative expansion
unless these logarithms are resummed to all orders.
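
To get a feel for the size of these logarithms, the following minimal Python sketch evaluates the bare terms $\alpha_s^n \ln^{2n}(m_J/p_T)$. The function name `double_log_term`, the value $\alpha_s = 0.1$, and the kinematic points are assumptions chosen for illustration; all coefficients are dropped, so the numbers indicate only the parametric growth and not an actual cross section.

```python
import math


def double_log_term(alpha_s: float, m_jet: float, p_t: float, order: int) -> float:
    """Bare size of the order-n Sudakov term alpha_s^n * ln^{2n}(m_J / p_T).

    Coefficients are ignored; this only tracks the parametric size.
    """
    log = math.log(m_jet / p_t)
    return (alpha_s ** order) * (log ** (2 * order))


if __name__ == "__main__":
    alpha_s = 0.1  # illustrative value, not a fitted coupling
    for m_jet in (50.0, 20.0):  # GeV, with p_T = 500 GeV (hypothetical inputs)
        terms = [double_log_term(alpha_s, m_jet, 500.0, n) for n in range(1, 5)]
        print(m_jet, terms)
```

For $m_J/p_T = 0.04$ the ratio of successive terms, $\alpha_s \ln^2(m_J/p_T)$, exceeds one with these inputs, which is the regime where a fixed order truncation is not reliable.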

Restoring perturbative control requires the hard
factorization in Eq. (1) to be extended by showing
soft-collinear factorization. When soft-collinear factorization
can be shown, the non-perturbative contribution of the
hadronic initial state is factorized from the perturbative
components, as in Eq. (1), while $\hat{\sigma}_{ij}$ is further factorized
by separating the collinear dynamics at the scale $m_J$ from
global soft radiation at the scale $m_J^2/p_T$.

SCET provides a systematic framework to carry out
soft-collinear factorization in the appropriate limit of
QCD. For a $pp \to N$-jet observable, such a factorization
theorem takes the schematic form [13 H3 \57 - 59

  $\sigma_N = H_N \times \Big[ B_a \times B_b \times \prod_{k=1}^{N} J_k \Big] \otimes S_N$ .   (2)

The hard function $H_N$ comes from the short-distance
hard scattering process, while the beam functions $B$ and
jet functions $J$ come from the collinear evolution of the
initial and final hard partons from the hard scattering.
The beam functions have an internal factorization
containing the parton distribution functions. Finally, the
soft function $S_N$ comes from global soft radiation. The
presence of the additional components in the factorization
theorem in Eq. (2) compared to Eq. (1) reflects the
fact that new scales are introduced into the problem by
the final state cuts which define the jets. When
factorization can be shown, OPS factorization is the physical
reason that we can separate the collinear and soft evolution
of the jets. It states that the soft and collinear
contributions to the measurement are independent. Unlike the
hard factorization in Eq. (1), which is often assumed, such a
factorization theorem is neither guaranteed nor assumed.



B. Power Counting in SCET 

Soft and collinear radiation in QCD provides the dominant
description of jet evolution in many jet physics
applications. In these cases there is typically a hierarchy of
scales related to the observable. The effective field theory
SCET provides a systematic expansion around the
soft-collinear limit of QCD below a hard scale, which
simplifies the proof of soft-collinear factorization. This is
done by expanding in a power counting parameter $\lambda \ll 1$
and gives rise to soft and collinear modes, which become
fields in SCET. For jet mass, $\lambda = m_J/E_J$. Their momentum
scaling, expressed in terms of light cone components
$p^\mu = (n\cdot p, \bar{n}\cdot p, p_\perp) = (p^+, p^-, p_\perp)$, is (see footnote 2)

  collinear:  $p_c = (p_c^+, p_c^-, p_c^\perp) \sim E_J\,(\lambda^2, 1, \lambda)$ ,
  soft:       $p_s = (p_s^+, p_s^-, p_s^\perp) \sim E_J\,(\lambda^2, \lambda^2, \lambda^2)$ ,   (3)

where $n$ and $\bar{n}$ are light-cone vectors, $n^\mu = (1, \vec{n})$ and
$\bar{n}^\mu = (1, -\vec{n})$, with $\vec{n}^2 = 1$. The jets shown in Fig. 1
have separate collinear modes, with momentum scaling
defined with respect to the jet direction $n_i$. The exchange
of collinear radiation between different jet directions has
invariant mass of order the hard scale $p_T \sim E_J$ and is
integrated out of the effective theory, but collinear radiation
inside of a jet remains. We note that the large component
of collinear modes scales as the jet energy, so the
relevant logs are logs of $m_J/E_J$. For light jets, $E_J = p_T \cosh y$,
and the pseudorapidity factor can enhance logs of $m_J/p_T$.
Soft modes, which have homogeneous scaling, can be
exchanged between different collinear sectors.

FIG. 1: Hierarchy of scales relevant for widely separated
collimated jets.
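
The scaling in Eq. (3) can be checked on explicit momenta. The sketch below is illustrative only: the helper names `massless`, `lightcone`, and `scaled_components`, the choice of the jet axis along $z$, and the sample angles and energies are assumptions, not part of the paper. It builds one massless collinear particle at angle $\lambda$ from the axis and one massless soft particle with energy $\lambda^2 E_J$ at an $O(1)$ angle, and returns their light-cone components in units of $E_J$. Note that the code returns the Euclidean magnitude $|p_\perp|$.

```python
import math


def massless(energy, theta, phi):
    """Massless four-vector (E, px, py, pz) at polar angle theta from the z axis."""
    return (energy,
            energy * math.sin(theta) * math.cos(phi),
            energy * math.sin(theta) * math.sin(phi),
            energy * math.cos(theta))


def lightcone(p, n_vec=(0.0, 0.0, 1.0)):
    """Return (p_plus, p_minus, |p_perp|) with p+ = n.p and p- = nbar.p.

    Uses n = (1, n_vec), nbar = (1, -n_vec) and the metric (+,-,-,-), so
    n.p = E - n_vec.p_vec and nbar.p = E + n_vec.p_vec.
    """
    energy, px, py, pz = p
    along = px * n_vec[0] + py * n_vec[1] + pz * n_vec[2]
    perp_sq = px * px + py * py + pz * pz - along * along
    return energy - along, energy + along, math.sqrt(max(perp_sq, 0.0))


def scaled_components(lam, e_jet=100.0):
    """Light-cone components, in units of E_J, of one collinear and one soft particle."""
    collinear = massless(e_jet, lam, 0.0)        # angle ~ lambda from the jet axis
    soft = massless(e_jet * lam ** 2, 1.0, 0.0)  # energy ~ lambda^2 E_J, angle ~ 1
    in_units = lambda p: tuple(x / e_jet for x in lightcone(p))
    return in_units(collinear), in_units(soft)


if __name__ == "__main__":
    for lam in (0.2, 0.1, 0.05):
        col, soft = scaled_components(lam)
        print(lam, col, soft)
```

Dividing the printed collinear components by $(\lambda^2, 1, \lambda)$ and the soft components by $\lambda^2$ gives numbers of order one for each value of $\lambda$, which is what the characteristic scaling means in practice.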

The momentum scaling in Eq. (3) simplifies the physical
picture of jet events. This is seen from the kinematics
of soft and collinear modes. The relative scaling of their
energies is

  $E_s / E_c \sim \lambda^2$ .   (4)

Therefore the energy, or equivalently transverse momentum,
of the jet is determined by the collinear particles in
the jet, up to power corrections of $O(\lambda^2)$ from soft
particles. The angle, $\Delta R^2 = \Delta\eta^2 + \Delta\phi^2$, between collinear
particles in the same jet (collinear sector) is

  $\Delta R_{cc} \sim \lambda$ .   (5)

The angle between soft particles, which interact globally
in the event, is $O(1)$:

  $\Delta R_{ss} \sim 1$ .   (6)

Finally we can use power counting to show that the angle
between a soft particle and any collinear particle in a jet
is the same at leading power and is equal to the angle of
the soft particle to the jet axis. For the polar angle $\theta_{cs}$,
the relation is

  $\frac{2\, p_c \cdot p_s}{E_c E_s} = 2(1 - \cos\theta_{cs})$
  $\qquad = \frac{4\, p_s^+}{p_s^+ + p_s^-} + O(\lambda^2)$
  $\qquad = 2(1 - \cos\theta_{ns}) + O(\lambda^2)$ .   (7)

An analogous relation holds for $\Delta R$:

  $\Delta R_{cs} = \Delta R_{ns} + O(\lambda^2)$ .   (8)

The angle $\Delta R_{ns}$ scales as $\lambda^0$. Soft particles do not
distinguish the individual collinear splittings in a jet, but
instead see each collinear sector as a color source moving
along the light-like direction $n_i$. The combination of a
soft and collinear particle in a jet, $p_c + p_s$, only changes
the $O(\lambda^2)$ component of the collinear momentum:

  $p_c + p_s = (p_c^+ + p_s^+,\ p_c^-,\ p_c^\perp)\,[1 + O(\lambda)]$ .   (9)

Since the $p^-$ and $p_\perp$ components determine the energy
and direction of the collinear mode, these are unchanged
up to power corrections.

2 Any four-vector can be decomposed as
$p^\mu = \bar{n}\cdot p\, \frac{n^\mu}{2} + n\cdot p\, \frac{\bar{n}^\mu}{2} + p_\perp^\mu$,
and $p^2 = p^+ p^- + p_\perp^2$.
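
A small numerical sketch can make the statements around Eqs. (7)-(9) concrete. Everything named here (`massless`, `cos_angle`, `compare`, the jet axis along $z$, and the sample angles) is an assumption for illustration. For a collinear particle at angle $\lambda$ and a soft particle of energy $\lambda^2 E_J$, the function returns the difference between $\cos\theta_{cs}$ and $\cos\theta_{ns}$, the fractional energy added by the soft particle, and the angle by which the merged momentum is tilted away from the collinear direction. The sketch only shows that these quantities shrink as $\lambda$ decreases; it does not attempt to verify the order of the power corrections.

```python
import math


def massless(energy, theta, phi):
    """Massless four-vector (E, px, py, pz) at polar angle theta from the z axis."""
    return (energy,
            energy * math.sin(theta) * math.cos(phi),
            energy * math.sin(theta) * math.sin(phi),
            energy * math.cos(theta))


def cos_angle(p, q):
    """Cosine of the opening angle between the three-momenta of p and q."""
    dot = p[1] * q[1] + p[2] * q[2] + p[3] * q[3]
    norm_p = math.sqrt(p[1] ** 2 + p[2] ** 2 + p[3] ** 2)
    norm_q = math.sqrt(q[1] ** 2 + q[2] ** 2 + q[3] ** 2)
    return dot / (norm_p * norm_q)


def compare(lam, e_jet=100.0, theta_soft=1.0):
    """Return (|cos(theta_cs) - cos(theta_ns)|, dE/E, tilt angle) for one c-s pair."""
    p_c = massless(e_jet, lam, 0.3)                    # collinear particle
    p_s = massless(e_jet * lam ** 2, theta_soft, 2.0)  # soft particle
    axis = (1.0, 0.0, 0.0, 1.0)                        # jet axis n along z
    diff = abs(cos_angle(p_c, p_s) - cos_angle(axis, p_s))
    merged = tuple(a + b for a, b in zip(p_c, p_s))
    energy_shift = (merged[0] - p_c[0]) / p_c[0]
    tilt = math.acos(min(1.0, cos_angle(merged, p_c)))
    return diff, energy_shift, tilt


if __name__ == "__main__":
    for lam in (0.2, 0.1, 0.05):
        print(lam, compare(lam))
```

With these inputs the energy shift equals $\lambda^2$ by construction and the tilt is parametrically smaller than the collinear angle $\lambda$, in line with the statement that merging with a soft particle leaves the collinear energy and direction unchanged at leading power.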

The explicit scaling of soft and collinear modes lets us
construct a simple physical picture of the jet and gives us
a framework to explore jet properties. Collinear particles
are restricted to a narrow region of angle $\sim \lambda$ around
the jet axis, and carry the total energy of the jet up to
power corrections in $\lambda$. Soft particles populate the entire
jet, and they cannot resolve individual collinear particles.
Instead, soft particles only resolve the jet direction. The
measurement can add dependence on the total energy of
each collinear sector. This basic picture lets us study
the properties of different jet algorithms, which we do in
Sec. III B.

It is important to note that the scaling of the kinematics
of soft and collinear modes is a characteristic scaling.
Soft and collinear particles can contribute to the jet
cross section at leading power in regions of phase space
where their momenta are parametrically smaller than the
characteristic scaling in Eq. (3) (namely when integrating
over phase space in real and virtual corrections). For
example, the characteristic scaling of the angle between
collinear particles is given in Eq. (5), but collinear particles
can also contribute to the jet observable for much smaller
values of $O(\lambda^2)$. We take this into account when necessary in
our analysis of jet substructure methods. Notice that
if the large component of the collinear momentum, $p_c^-$
in Eq. (3), becomes $O(\lambda^2)$, the collinear and soft modes
overlap. This double-counting is explicitly removed from
the effective theory by the zero-bin subtraction. In the
next section, we see how power counting is applied to
prove soft-collinear factorization.
C. Factorization in SCET

Factorization requires proving that the cross section 
can be separated to all orders in perturbation theory into 
matrix elements that depend on dynamics at different 
scales. In SCET, factorization will express the cross
section in a form like Eq. (2), with independently calculable
hard, jet, and soft functions. Although formal
factorization proofs require a significant amount of technical
machinery, a simple physical statement is at the heart of 
soft-collinear factorization. In SCET, this statement is 
that the soft and collinear contributions to the observ- 
able are distinct at all orders. A formal factorization 
proof requires four essential steps: 

1. We must show that soft and collinear dynamics are
the relevant modes to describe the final states and
observable we are interested in. In particular, the scales
in the problem are such that SCET with the modes
in Eq. (3) (formally known as SCET$_{\rm I}$; see footnote 3)
provides the correct EFT description.

2. Interactions between soft and collinear fields in the
SCET N-jet operator must not exist at leading order
in $\lambda$. This soft-collinear decoupling is process
independent and follows straightforwardly in the
construction of SCET [31].

3. For the observable of interest, the phase space con- 
straints in the soft and collinear sectors must be in- 
dependent of each other to all orders in perturbation 
theory. We refer to this as soft-collinear observable 
phase space factorization (OPS), and the focus 
of this paper is to show that a simple analysis can 
determine if it is satisfied. 

4. An operator definition must be supplied for all com- 
ponents in the factorization theorem. This includes a 
measurement operator which imposes the final state 
cuts in the factorized cross section. 

We discuss these points with an example, $e^+e^- \to 2$
hemisphere jets at center of mass energy $Q$, where we
measure the mass $m_{1,2}$ in each hemisphere. In the limit
of small jet mass, soft and collinear modes describe each
jet and SCET$_{\rm I}$ can be used to sum the logs of $m_{1,2}/Q$
[60-63]. Up to power corrections in $m_{1,2}^2/Q^2$, each jet
mass is a simple sum,

  $m_J^2 = Q \Big( n\cdot p_c + \sum_s n\cdot p_s \Big) = m_c^2 + m_s^2$ ,   (10)

where $m_{c(s)}$ is the collinear (soft) contribution to the jet
mass. The scales associated with the jet and soft functions
can be determined by the collinear and soft
contributions to the observable. In SCET$_{\rm I}$, the scales in terms
of the power counting parameter are

  $\mu_J \sim Q\lambda$ ,   $\mu_S \sim Q\lambda^2$ .   (11)

The contributions to the observable then give the scales
$\mu_{J,S}$ in terms of $m_{1,2}$:

  $m_c^2 = Q\, n\cdot p_c \sim Q^2\lambda^2 \;\Rightarrow\; \mu_J \sim m_{1,2}$ ,
  $m_s^2 = Q\, n\cdot p_s \sim Q^2\lambda^2 \;\Rightarrow\; \mu_S \sim m_{1,2}^2/Q$ .   (12)

3 SCET$_{\rm II}$ is a formulation of SCET with soft modes whose
momenta scale as $p_s \sim p_c\,(\lambda, \lambda, \lambda)$. Since these modes are
parametrically more energetic than the soft modes in Eq. (3), soft modes
in SCET$_{\rm I}$ are often referred to as ultrasoft modes.
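
As a quick numerical illustration of Eqs. (11) and (12), the following sketch tabulates the hard, jet, and soft scales for a given hemisphere mass. The function name `scet_scales` and the identification $\lambda = m/Q$ with all $O(1)$ factors set to one are assumptions made for illustration; the paper fixes only the parametric scaling.

```python
def scet_scales(mass: float, q_hard: float) -> dict:
    """Parametric SCET_I scales for a jet of mass `mass` at hard scale `q_hard`.

    Assumes lambda = mass / q_hard with every O(1) factor set to one.
    """
    lam = mass / q_hard
    return {
        "lambda": lam,
        "mu_hard": q_hard,
        "mu_jet": q_hard * lam,        # ~ m, from Eq. (11)
        "mu_soft": q_hard * lam ** 2,  # ~ m^2 / Q, from Eq. (11)
    }


if __name__ == "__main__":
    # Hypothetical inputs: Q = 1000 GeV and hemisphere masses of 100 and 30 GeV.
    for mass in (100.0, 30.0):
        print(mass, scet_scales(mass, 1000.0))
```

The widening gap between `mu_hard`, `mu_jet`, and `mu_soft` as the mass decreases is the hierarchy that generates the large logarithms resummed by renormalization group evolution.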

The QCD cross section can be written as a forward
scattering matrix element of the current:

  $\sigma(m_1, m_2) \sim \langle 0 |\, J_{\rm QCD}^\dagger\, \mathcal{R}(m_1, m_2)\, J_{\rm QCD}\, | 0 \rangle$ .

The restriction operator $\mathcal{R}(m_1, m_2)$ imposes the
measurement on the final state,

  $\mathcal{R}(m_1, m_2) = \delta(m_1 - \hat{m}_1)\, \delta(m_2 - \hat{m}_2)$ ,   (13)

where $\hat{m}_{1,2}$ measure the mass of each hemisphere jet on
a given final state:

  $\hat{m}_{1,2}\, | X_{{\rm hemi}\,1,2} \rangle = m_{1,2}\, | X_{{\rm hemi}\,1,2} \rangle$ .   (14)

In SCET an N-jet final state is described by an operator,
$O_N$, with $N$ collinear sectors coupled to soft modes
which are exchanged between them. For the dijet invariant
mass, the SCET matrix element is

  $\langle 0 |\, O_2^\dagger\, \mathcal{R}(m_1, m_2)\, O_2\, | 0 \rangle$ .   (15)

In order to factorize the cross section in SCET we have
to carry out steps 2 through 4 in the above list. We must
factorize the 2-jet operator and the restriction operator
into soft and collinear pieces.

The BPS field redefinition performs soft-collinear
decoupling of the operator, factorizing $O_2$ schematically
into

  $O_2 \to [O_{c_1}]\,[O_{c_2}]\,[O_s]$ ,   (16)

where the two collinear sectors $c_{1,2}$ do not couple to each
other or the soft operator $O_s$ to leading power in $\lambda$. This
satisfies step 2 for any 2-jet observable of interest.

Factorization of $\mathcal{R}$ divides the operator into separate
measurements in the collinear and soft sectors. At a formal
level, this factorizes the operator:

  $\mathcal{R} = \mathcal{R}_s \otimes \prod_i \mathcal{R}_c^i$ ,   (17)

where the convolution comes from the observable
dependence in each operator. The soft restriction operator $\mathcal{R}_s$
acts only on soft modes, while the collinear restriction
operator $\mathcal{R}_c^i$ acts only on the collinear modes in sector $i$.



OPS factorization tests whether this holds by examining
the phase space constraints in each sector. For the jet
mass, this is straightforward: from Eq. (10), the soft and
collinear particles in the jet contribute linearly to the
mass, and the jet boundaries enforced by the measurement
are determined by the jet directions. This satisfies
step 3 in the above list. For jet algorithms, OPS
factorization requires showing that the constraints that specify
which soft and collinear particles are put in the jet are
independent of the other sectors. For jet substructure, it
requires showing that the kinematic cuts placed on the jet
constituents separate into cuts on the soft and collinear
phase space. For example, cuts that remove wide angle
soft radiation in the jet must be independent of the
collinear phase space.
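
The additivity of the jet mass in Eq. (10) can be checked on a toy configuration. The sketch below is an illustration under stated assumptions: the helper names, the sample momenta, and the choice to take the axis $\vec{n}$ along the total collinear momentum (with its large light-cone component playing the role of $Q$) are not taken from the paper. It compares the exact invariant mass squared of a jet built from two collinear and two soft particles with the sum of separately computed collinear and soft contributions.

```python
import math


def massless(energy, theta, phi):
    """Massless four-vector (E, px, py, pz) at polar angle theta from the z axis."""
    return (energy,
            energy * math.sin(theta) * math.cos(phi),
            energy * math.sin(theta) * math.sin(phi),
            energy * math.cos(theta))


def total(momenta):
    """Component-wise sum of a list of four-vectors."""
    return tuple(sum(components) for components in zip(*momenta))


def mass_sq(p):
    """Invariant mass squared with metric (+,-,-,-)."""
    return p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2


def n_dot(p, n_vec):
    """n.p = E - n_vec.p_vec for n = (1, n_vec)."""
    return p[0] - sum(a * b for a, b in zip(n_vec, p[1:]))


def jet_mass_pieces(collinear, soft):
    """Return (exact m^2, collinear piece, soft piece) in the spirit of Eq. (10)."""
    p_c = total(collinear)
    norm = math.sqrt(sum(x * x for x in p_c[1:]))
    n_vec = tuple(x / norm for x in p_c[1:])  # axis along total collinear momentum
    q_large = p_c[0] + norm                   # nbar.p_c, standing in for Q
    m2_exact = mass_sq(total(list(collinear) + list(soft)))
    m2_c = q_large * n_dot(p_c, n_vec)
    m2_s = q_large * sum(n_dot(p, n_vec) for p in soft)
    return m2_exact, m2_c, m2_s


# Hypothetical jet with lambda ~ 0.1: collinear angles ~ 0.1, soft energies ~ 1% of E_J.
COLLINEAR = [massless(60.0, 0.08, 0.0), massless(40.0, 0.12, math.pi)]
SOFT = [massless(1.0, 1.0, 1.0), massless(0.5, 0.7, 4.0)]

if __name__ == "__main__":
    print(jet_mass_pieces(COLLINEAR, SOFT))
```

The soft piece depends on the collinear sector only through the axis and the large momentum component, not through the individual collinear splittings, which is the independence that OPS factorization requires.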

Formally, factorization of the restriction operator
requires providing operator definitions for $\hat{m}_{1,2}$ in step 4.
For the jet mass (equivalent to thrust $\tau$ in the small mass
limit), these operators were constructed explicitly from
the energy-momentum tensor in [39]. This, coupled with
the arguments of soft-collinear factorization, shows

  $\hat{m}_{1,2}^2 = (\hat{m}_{1,2}^{c})^2 + (\hat{m}_{1,2}^{s})^2$ ,   (18)



from which Eq. (17) follows. Factorization of the
restriction operator allows us to write the cross section in the
form

  $\sigma(m_1, m_2) = H_2 \int dm_1^c\, dm_1^s\, dm_2^c\, dm_2^s\;
    \delta\big(m_1^2 - (m_1^c)^2 - (m_1^s)^2\big)\,
    \delta\big(m_2^2 - (m_2^c)^2 - (m_2^s)^2\big)\,
    J(m_1^c)\, J(m_2^c)\, S(m_1^s, m_2^s)$ .   (19)

The jet and soft functions are forward scattering matrix
elements of soft or collinear fields with restriction
operators that act only on the phase space of that sector. We
will explore OPS factorization with several jet algorithm
and substructure examples in the remainder of this paper.



III. A TEST FOR SOFT-COLLINEAR 
FACTORIZATION 

In this section we analyze a necessary condition for
factorization relevant to a wide class of jet observables where
the soft and collinear dynamics of SCET$_{\rm I}$ provide the
relevant description. We focus on a key step in the proof of
factorization as outlined in Sec. II C: OPS factorization.
This is powerful at detecting when factorization fails, and
it is sensitively dependent on the precise definition of the
jet observable. It is also very useful in characterizing the
behavior of algorithms or substructure.

The jet and soft functions give the contribution of soft
and collinear modes to the cross section. These functions
implement the phase space constraints from the observable,
which can be expressed as a restriction operator $\mathcal{R}$
that acts on the final state. OPS factorization splits the
restriction operator into separate pieces that only act on
modes in a given sector, as in Eq. (17). Proving OPS
factorization requires showing the following:

• To leading order in the power counting parameter
$\lambda$, the phase space constraints on soft and collinear
modes must be independent of each other to all orders
in perturbation theory.

We can test this by carrying out a simple power counting 
analysis on the algorithm or observable, and generally 
find that it can be very constraining. 

Note that the soft modes are not completely ignorant
of the collinear modes. The soft Wilson line knows about
the direction of the associated collinear sector, and the
measurement can add dependence on the total energy
in each collinear sector. However, the phase space
constraints in the soft sector must be independent of
individual collinear splittings or momenta. Similarly, the
collinear phase space constraints must be independent of
individual soft momenta. If the phase space constraints
are not independent, then the measurement operator cannot
factorize as it does in the example in Eq. (18), since
the action of $\mathcal{R}_s$ or $\mathcal{R}_c$ would depend on both collinear
and soft states.



The factorization in Eq. (17) must hold at all orders
in $\alpha_s$; it is not sufficient to consider cases at lowest order
in perturbation theory. However, we will find that studying
fixed order configurations can be very useful and can often
be generalized to all-orders arguments. For instance,
since recombination-style jet algorithms deal with pairs
of particles, often simple $O(\alpha_s^2)$ configurations are
sufficient to make broader statements about the behavior and
factorizability of an algorithm.

Because jets cover a wide kinematic range, factorization
constraints will depend on the types of cuts placed
on the jet. The additional scales present in jet substructure
algorithms can require factorization that goes beyond
the simple soft-collinear factorization that we discuss
here. In any case, we must show that step 1 in the
list from Sec. II C still holds, which now may require
additional modes. An example relevant for jet substructure is
the case where two jets (or subjets) come close together
[64]. This factorization is described by SCET+, an
extension of SCET$_{\rm I}$ that includes an additional mode to
describe soft radiation into the nearby jets. Another
possibility is that a different description of soft and collinear
modes is needed, such as SCET$_{\rm II}$, where the components
of soft momentum scale as $\lambda$ instead of $\lambda^2$. However, soft
and collinear modes are still at the heart of jet physics
problems, and OPS factorization is still a key part of
factorization in these cases. We discuss alternative forms of
factorization in more detail in Sec. V.



A. The Structure of Jets in SCET 

In this section we highlight some of the generic con- 
straints imposed by OPS factorization on jet algorithms 
and substructure methods. We use the structure of the 
jet implied by power counting of the soft and collinear 
modes in SCET. 
Consider a jet found by a recombination algorithm,
which builds jets through $2 \to 1$ merging of particles.
The recombinations that make the jet in SCET consist
of three different kinds of merging of soft and collinear
modes:

  $c, c \to c$ ,   $c, s \to c$ ,   $s, s \to s$ .   (20)

It is useful to depict the sequence of mergings as a tree.
This illustrates the action of the algorithm, which
generates the phase space constraints of the jet and should
not be mistaken for the jet evolution. In QCD each
jet has its own tree. However, with soft and collinear
modes we have separate trees for each type of mode.
Each jet in the event is therefore represented by two
recombination trees. Collinear-collinear recombinations
are confined to the tree of collinear particles, and
soft-soft recombinations are confined to the tree of soft
particles. Soft-collinear recombinations are special; recall
Eq. (9), which shows that the collinear energy and direction
are unchanged up to power corrections. Therefore,
the $c, s \to c$ merging is represented in the soft tree. Since
the soft mode can only resolve the jet direction, and not
the direction of the specific collinear mode, the soft
particle only sees that it is recombining with "the jet",
represented by a Wilson line in the EFT. This is equivalent
to the statement in Eq. (8) that $\Delta R_{cs} = \Delta R_{ns} + O(\lambda^2)$.
The structure of the trees of soft and collinear modes is
shown in Fig. 2.

[Figure 2: diagram titled "Recombinations in a Jet", with a collinear sector tree (collinear particles merging into the jet) and a soft sector tree (soft particles merging with each other and with the jet), ordered by merging time.]

FIG. 2: Recombination structure of the soft and collinear
particles in a jet. Because the energy and direction of a collinear
particle are unchanged after merging with a soft particle,
soft-collinear recombinations are represented in the soft tree. And,
since soft modes can only resolve the jet direction and not
individual collinear particles, the collinear mode in these
recombinations is represented by a thick line for the jet.

Given the recombination structure of a jet in SCET,
it is easy to see that OPS factorization implies that the
cross section cannot know about the relative ordering of
mergings between the two trees. The soft function only knows about
the recombinations in the soft tree, and each jet function
only knows about the recombinations in a collinear
tree. The order of recombinations in one tree relative to the other
is not available to any function in a factorized cross section.
This means that, for example, an observable that
depends on the last recombination in the jet (the $2 \to 1$
recombination that forms the jet) cannot be factorized,
since we cannot even know the type of recombination
($c, c \to c$ or $c, s \to c$). This has direct consequences
for jet substructure methods that depend on
declustering the jet.

The natural objects that jet substructure methods deal
with are subjets, which are clusters of particles within
the jet. Collinear subjets contain both collinear and soft 
particles, and have the momentum scaling of collinear 
particles. Soft subjets contain only soft particles, and 
so contain contributions only from the soft tree; their 
momentum scales like soft particles. This terminology is 
useful in the following sections. 



B. Examples from Jet Algorithms 

As a warm-up to jet substructure, we apply the ideas 
of soft-collinear factorization to jet algorithms. They 
provide a familiar and simple example, and the power- 
counting analysis of OPS factorization gives insight into 
the leading behavior of the algorithms. Our primary examples
are the $k_T$, Cambridge/Aachen, and anti-$k_T$ algorithms,
all of which factorize. SCET factorization constraints
for these algorithms were explored in [45]. We
also consider the JADE algorithm, which provides a use- 
ful example of an algorithm that does not factorize. 

All of the algorithms we consider are recombination
algorithms. These algorithms use two metrics:

$$ \rho_{ij} : \text{distance between } i \text{ and } j , \qquad \rho_i : \text{single particle metric for } i . \qquad (21) $$

The algorithm operates by finding the smallest of all $\rho_{ij}$
and $\rho_i$. If the smallest is a pairwise metric $\rho_{ij}$, particles $i$
and $j$ are merged by adding their four-momenta. If the
smallest is a single particle metric $\rho_i$, then particle $i$ is
promoted to a candidate jet and removed from the set of
particles. This process is repeated until all particles have
been combined into candidate jets, and only candidate
jets with energy or $p_T$ greater than a cutoff are counted
as final state jets.

Since the jet is made up of collinear and soft particles,
there are multiple kinds of pairwise and single particle
metrics. The pairwise metrics come from collinear-collinear,
collinear-soft, and soft-soft pairs:

$$ \rho_{cc} , \quad \rho_{cs} , \quad \rho_{ss} . \qquad (22) $$

There are also collinear and soft single particle metrics:

$$ \rho_c , \quad \rho_s . \qquad (23) $$

The scaling behavior of these metrics determines the behavior
and factorization properties of the jet algorithm.



1. The $k_T$ Class of Algorithms

The most common recombination algorithms are the
$k_T$ class of algorithms. They are parameterized by a
number $\alpha$, where $\alpha = 1$ is the $k_T$ algorithm, $\alpha = 0$
is Cambridge/Aachen, and $\alpha = -1$ is the anti-$k_T$ algorithm.
The metrics for these algorithms are

$$ \rho_{ij} = \min(p_{Ti}^{\alpha}, p_{Tj}^{\alpha}) \, \Delta R_{ij} , \qquad \rho_i = p_{Ti}^{\alpha} R . \qquad (24) $$

The $k_T$, C/A, and anti-$k_T$ algorithms produce very different
behavior in the structure of the jet.
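The recombination procedure described above can be sketched in a few lines of Python. This is a minimal, unoptimized illustration rather than the implementation used in any jet-finding package. The function names (`cluster`, `from_pt_y_phi`) and the four-vector layout `(E, px, py, pz)` are assumptions made for the example, and the metrics are taken to be $\rho_{ij} = \min(p_{Ti}^{\alpha}, p_{Tj}^{\alpha}) \Delta R_{ij}$ and $\rho_i = p_{Ti}^{\alpha} R$.

```python
import math

def pt(p):
    """Transverse momentum of a four-vector (E, px, py, pz)."""
    return math.hypot(p[1], p[2])

def rapidity(p):
    return 0.5 * math.log((p[0] + p[3]) / (p[0] - p[3]))

def phi(p):
    return math.atan2(p[2], p[1])

def delta_r(a, b):
    """Distance in the (y, phi) plane, with the azimuth wrapped into [0, pi]."""
    dphi = abs(phi(a) - phi(b))
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(rapidity(a) - rapidity(b), dphi)

def from_pt_y_phi(pt_, y, phi_):
    """Massless four-vector from (pT, y, phi); a helper assumed for this sketch."""
    return (pt_ * math.cosh(y), pt_ * math.cos(phi_),
            pt_ * math.sin(phi_), pt_ * math.sinh(y))

def cluster(particles, alpha, R):
    """Inclusive recombination: alpha = 1 (kT), 0 (C/A), -1 (anti-kT)."""
    objs = list(particles)
    jets = []
    while objs:
        # Smallest single particle metric rho_i = pT_i^alpha * R.
        best = min(range(len(objs)), key=lambda i: pt(objs[i]) ** alpha * R)
        best_val, best_pair = pt(objs[best]) ** alpha * R, None
        # Look for a smaller pairwise metric rho_ij.
        for i in range(len(objs)):
            for j in range(i + 1, len(objs)):
                rho = (min(pt(objs[i]) ** alpha, pt(objs[j]) ** alpha)
                       * delta_r(objs[i], objs[j]))
                if rho < best_val:
                    best_val, best_pair = rho, (i, j)
        if best_pair is None:
            # A single particle metric is smallest: promote to a candidate jet.
            jets.append(objs.pop(best))
        else:
            # A pairwise metric is smallest: merge by adding four-momenta.
            i, j = best_pair
            merged = tuple(a + b for a, b in zip(objs[i], objs[j]))
            objs.pop(j)  # j > i, so remove j first
            objs.pop(i)
            objs.append(merged)
    return sorted(jets, key=pt, reverse=True)

# Toy event: one energetic particle, one nearby soft particle, one far soft particle.
event = [from_pt_y_phi(100.0, 0.0, 0.0),
         from_pt_y_phi(1.0, 0.3, 0.0),
         from_pt_y_phi(1.0, 2.0, 0.0)]
for alpha in (1, 0, -1):
    print(alpha, [round(pt(j), 3) for j in cluster(event, alpha, R=1.0)])
```

In this toy event all three choices of $\alpha$ find the same two candidate jets; the algorithms differ in the order of the mergings, which is the property that the power counting analysis constrains.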

To understand the factorization properties of the algorithms,
we determine the characteristic scaling of $\rho_{cc}$, $\rho_{cs}$,
and $\rho_{ss}$ for central jets. These are given in Table I, and
in Fig. 4(a) we show the relative ranking of the pairwise
metrics for each algorithm. As noted in Sec. II B, it is
possible for the pairwise metric to have a scaling smaller
than the characteristic scaling shown while contributing
to the jet at leading power in $\lambda$. We consider this where
relevant in the following discussion.

OPS factorization requires that the comparison of the
metrics needed to run the algorithm decouple soft and
collinear phase space constraints. For the $k_T$ and C/A
algorithms, this is straightforward. First, note that the
pairwise metrics $\rho_{ss}$ and $\rho_{cs}$ depend only on soft momenta
and the jet direction $n$, through $\Delta R_{ns}$. Although
naively the phase space constraints from comparing $\rho_{cc}$
with $\rho_{cs}$ or $\rho_{ss}$ would ruin factorization (because we
are making a comparison whose value depends on both
collinear and soft momenta), this ordering is in fact irrelevant.
From Eq. (9), the energy and direction of a
collinear particle are the same at leading order before
and after a soft-collinear recombination. This implies
that collinear-collinear recombinations and soft-collinear
or soft-soft recombinations occur independently of each
other, and which particles are merged into the final state
jet does not depend on the relative ordering.

As observed in Sec. III A, the factorized cross section
cannot know about this ordering. This gives a simple,
general constraint on a factorizable jet algorithm: at
leading order in the power counting, the set of particles
included in a jet cannot depend on the relative ordering
of $\rho_{cc}$ and $\rho_{cs}$ or $\rho_{ss}$.

OPS factorization for the anti-$k_T$ algorithm is slightly
more complicated since $\rho_{cs}$ depends on both soft and
collinear momenta. As with the $k_T$ and C/A algorithms,
the ordering of $\rho_{cc}$ and $\rho_{cs}$ or $\rho_{ss}$ is irrelevant. However,
the comparison between $\rho_{cs}$ and $\rho_{ss}$ for the anti-$k_T$ algorithm
would naively spoil factorization. We show that
factorization holds by considering a simple configuration,
shown in Fig. 3, which can be easily generalized. Consider
a collinear particle $c$ and soft particles $s_1$ and $s_2$,
where

$$ \Delta R_{ns_1} < R < \Delta R_{ns_2} . \qquad (25) $$

If $\rho_{cs_1} < \rho_{s_1 s_2}$, then the $c$-$s_1$ recombination occurs first
and the jet does not include $s_2$. But if $\rho_{s_1 s_2} < \rho_{cs_1}$, then
the $s_1$-$s_2$ recombination puts both soft modes either in
or out of the jet. However, the region of phase space
for which the $s_1$-$s_2$ recombination occurs first is power
suppressed. The condition on the soft-soft angle for this
to occur is

$$ \Delta R_{s_1 s_2} < \frac{\max(p_{Ts_1}, p_{Ts_2})}{p_{Tc}} \, \Delta R_{ns_1} \lesssim O(\lambda^2 R) . \qquad (26) $$

Since the configuration we are considering has $\Delta R_{ns_2} > R$,
this means that the only region of phase space where the
soft-soft recombination occurs first is when the two soft
modes are located at the jet boundary, separated by an
angle of $O(\lambda^2 R)$. This region of phase space is power
suppressed since the total area of this region is parametrically
smaller than the jet area. This means these configurations
do not contribute to the rate at leading power,
and can be ignored in the same approximation. Therefore,
the OPS factorization constraints on the pairwise
metrics are satisfied.

FIG. 3: A jet of size $R$ in the $n$ direction, shown schematically
in the $y$-$\phi$ plane. A collinear particle, $c$, and two soft
particles, $s_1$ and $s_2$, are shown.
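For completeness, the bound in Eq. (26) follows in a few lines from the anti-$k_T$ metrics ($\alpha = -1$ in Eq. (24)). The following is a sketch of that step, assuming $p_{Tc} \gg p_{Ts_{1,2}}$ and that the soft-collinear distance can be replaced by $\Delta R_{ns_1}$ at leading power:

```latex
\begin{align}
\rho_{s_1 s_2} &= \min\!\left(\frac{1}{p_{Ts_1}}, \frac{1}{p_{Ts_2}}\right) \Delta R_{s_1 s_2}
               = \frac{\Delta R_{s_1 s_2}}{\max(p_{Ts_1}, p_{Ts_2})} , \\
\rho_{c s_1}   &= \min\!\left(\frac{1}{p_{Tc}}, \frac{1}{p_{Ts_1}}\right) \Delta R_{n s_1}
               = \frac{\Delta R_{n s_1}}{p_{Tc}} , \\
\rho_{s_1 s_2} < \rho_{c s_1}
  \;&\Longleftrightarrow\;
  \Delta R_{s_1 s_2} < \frac{\max(p_{Ts_1}, p_{Ts_2})}{p_{Tc}} \, \Delta R_{n s_1} .
\end{align}
```

With $p_{Ts}/p_{Tc} \sim \lambda^2$ and $\Delta R_{ns_1} < R$, the right-hand side is at most of order $\lambda^2 R$, which is the scaling quoted in Eq. (26).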

Finally, for all the algorithms one may be concerned 
that comparisons between the single and pairwise met- 
rics could spoil factorization. However, for energetic jets 
(jets that contain collinear particles), the single particle 
metric only serves to enforce the constraint that two par- 
ticles can be recombined only if their angular separation 
is less than R. For soft jets, since there are no collinear 
modes the phase space only depends on soft momenta 
and trivially factorizes. 

We have seen, using a simple power counting analysis 
with soft and collinear modes, that the $k_T$, C/A, and
anti-$k_T$ algorithms satisfy OPS factorization. These algorithms
can be formally factorized in SCET by providing
an operator definition for the phase space restrictions
that the algorithm implements. We turn now to the dom- 
inant behavior of these algorithms. 



2. Characteristic Behavior of the $k_T$ Class of Algorithms

For each algorithm the ordering of the parametric scaling
of the metrics indicates the order in which recombinations
tend to occur. This gives the characteristic behavior
of the algorithm. This behavior is well known, but it is
instructive to see it arise from a simple power counting
analysis, and the same arguments will be useful for jet
substructure. We need only consider the pairwise metrics,
since the single particle metrics simply set the size
of the jet. We also only deal with energetic jets, as soft
jets have a simpler behavior but are usually removed by
$p_T$ cuts.

TABLE I: Metric scaling for the $k_T$ class of jet algorithms.

| metric | $k_T$ ($\alpha = 1$) | C/A ($\alpha = 0$) | anti-$k_T$ ($\alpha = -1$) |
|---|---|---|---|
| $\rho_{cc}$ | $\min(p_{Tc_i}, p_{Tc_j}) \, \Delta R_{c_i c_j} \sim O(\lambda)$ | $\Delta R_{c_i c_j} \sim O(\lambda)$ | $\min(1/p_{Tc_i}, 1/p_{Tc_j}) \, \Delta R_{c_i c_j} \sim O(\lambda)$ |
| $\rho_{cs}$ | $p_{Ts} \, \Delta R_{ns} \sim O(\lambda^2)$ | $\Delta R_{ns} \sim O(1)$ | $\Delta R_{ns} / p_{Tc} \sim O(1)$ |
| $\rho_{ss}$ | $\min(p_{Ts_i}, p_{Ts_j}) \, \Delta R_{s_i s_j} \sim O(\lambda^2)$ | $\Delta R_{s_i s_j} \sim O(1)$ | $\min(1/p_{Ts_i}, 1/p_{Ts_j}) \, \Delta R_{s_i s_j} \sim O(\lambda^{-2})$ |
| $\rho_c$ | $p_{Tc} R \sim O(R)$ | $R$ | $R / p_{Tc} \sim O(R)$ |
| $\rho_s$ | $p_{Ts} R \sim O(\lambda^2 R)$ | $R$ | $R / p_{Ts} \sim O(\lambda^{-2} R)$ |

For the $k_T$ algorithm, $\rho_{ss} \sim \rho_{cs} \ll \rho_{cc}$. This means
that the first step in the algorithm is to merge soft particles
with soft and collinear particles. Eventually only
collinear particles will remain, and the next step of the
algorithm merges them to form jets. Since soft particles
are merged first, the jet boundary is determined by the
local clusters of soft radiation at the edge of the jet. This
gives rise to the "vacuuming" effect of the $k_T$ algorithm,
where the boundary of the jet is amorphous and tends to
include soft radiation farther than $R$ from the jet axis.

For the C/A algorithm, $\rho_{cc} \ll \rho_{ss} \sim \rho_{cs}$. In this
case collinear particles will be merged first, and they will
merge into a single collinear object that has the jet energy
and direction (up to power corrections). Then soft
particles will merge among themselves and with the jet.
Just like the $k_T$ algorithm, the shape of the jet boundary
is determined by the local soft-soft recombinations.
However, since the metric of the algorithm weights only
by angle and not by $p_T$ (as with $k_T$), the amount of
vacuuming with the C/A algorithm is less than with the $k_T$
algorithm, since soft particles at the periphery of the jet
are less frequently merged into the jet.

For the anti-$k_T$ algorithm, $\rho_{cc} \ll \rho_{cs} \ll \rho_{ss}$. As with
the C/A algorithm, collinear particles are merged to form
a single collinear mode (the jet) first. Since $\rho_{cs} \ll \rho_{ss}$,
the next step in the anti-$k_T$ algorithm is to merge all soft
particles within a radius $R$ of the jet axis. At this point
nothing else will be merged into the jet, since all remaining
soft particles are too far from the jet axis. Soft-soft
recombinations are only relevant for pairs of soft particles
whose separation is parametrically smaller than the
scaling would indicate, or for jets consisting of only soft
particles. This means that the anti-$k_T$ jets will be very
circular, as is well known.
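The orderings quoted in the three paragraphs above can be reproduced by bookkeeping powers of $\lambda$. The sketch below is only an illustration of that counting. It assumes the characteristic scalings $p_{Tc} \sim 1$, $p_{Ts} \sim \lambda^2$, $\Delta R_{cc} \sim \lambda$, and $\Delta R_{cs} \sim \Delta R_{ss} \sim 1$, and it stores a quantity scaling as $\lambda^k$ as the integer $k$. The names `metric_exponent` and `merging_order` are invented for the example.

```python
# Leading-power exponents: a quantity scaling as lambda**k is stored as k.
# Assumed power counting: collinear pT ~ 1, soft pT ~ lambda^2,
# Delta R_cc ~ lambda, Delta R_cs ~ Delta R_ss ~ 1.
PT_EXP = {"c": 0, "s": 2}
DR_EXP = {"cc": 1, "cs": 0, "ss": 0}

def metric_exponent(pair, alpha):
    """Exponent k such that rho_pair ~ lambda**k for the given alpha."""
    a, b = pair
    # For lambda << 1 the smaller of two quantities has the larger exponent,
    # so min(pT_a^alpha, pT_b^alpha) corresponds to the max of the exponents.
    min_pt_exp = max(alpha * PT_EXP[a], alpha * PT_EXP[b])
    return min_pt_exp + DR_EXP[pair]

def merging_order(alpha):
    """Pairs sorted from smallest metric (merged first) to largest."""
    exps = {pair: metric_exponent(pair, alpha) for pair in ("cc", "cs", "ss")}
    order = sorted(exps, key=lambda pair: -exps[pair])
    return order, exps

for name, alpha in (("kT", 1), ("C/A", 0), ("anti-kT", -1)):
    order, exps = merging_order(alpha)
    print(name, order, exps)
```

Pairs with equal exponents, such as $\rho_{ss}$ and $\rho_{cs}$ for $k_T$, are parametrically of the same size, so their relative position in the sorted list carries no meaning.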

We can compare the behavior of the algorithms implied
by the parametric scaling to the behavior in Monte Carlo
simulation. We use the Boost 2010 event samples$^4$ here
and for the Monte Carlo study of pruning behavior in
Sec. IV [33]. The particular sample we use is simulated
QCD events with hard partons between 500 and 600 GeV,
generated with Pythia v6.421 using the DW tune. We
implement the standard analysis cuts suggested in the
Boost 2010 report, keeping only visible particles (except
muons) with $|\eta| < 5.0$. Only the two hardest jets with
$p_T > 200$ GeV are used. We use FastJet 3.0$\beta$1 to cluster
the jets and look at the substructure [55].

4 These events are publicly available from two repositories, hosted
by Gavin Salam and The University of Washington.

Define the relative merging time for particle $i$ in a jet
to be

$$ \text{relative merging time} = \frac{n^{i}_{\rm step}}{n^{\rm jet}_{\rm step}} , \qquad (27) $$

where $n^{\rm jet}_{\rm step}$ is the number of recombination steps needed
to make the jet and $n^{i}_{\rm step}$ is the step number at which
particle $i$ was first merged with another particle in the jet.
The first particles to merge have relative merging time 0,
while a particle that is not recombined until the end of
the algorithm has relative merging time 1. In Fig. 4(b)
we cluster jets using $R = 1.0$ with each algorithm and
select those with $m/p_T < 0.25$. We then plot the merging
time versus the ratio of the particle's $p_T$ to the highest
$p_T$ particle in the jet for the three algorithms. The figure
shows the density of particles in these two variables, and
because there are many more particles with small $p_T$ than
large $p_T$ we normalize the density separately in each bin
of $p_T$ ratio.
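As an illustration of the definition in Eq. (27), the relative merging time can be read off a recorded merging history. The sketch below is hypothetical in its data layout: the history is a list of index pairs in merging order, original particles carry indices below `n_particles`, and the object created at step $k$ is given index `n_particles + k`. Steps are counted from zero and normalized by the last step, a convention chosen here so that the first merging has time 0 and the last has time 1.

```python
def relative_merging_times(n_particles, history):
    """Map each original particle index to its relative merging time.

    history: list of (a, b) index pairs in merging order. Indices below
    n_particles are original particles; the object formed at step k has
    index n_particles + k (an assumed convention for this sketch).
    """
    n_steps = len(history)
    times = {}
    for step, (a, b) in enumerate(history):
        for idx in (a, b):
            # Record only the first merging of each original particle.
            if idx < n_particles and idx not in times:
                times[idx] = step / (n_steps - 1) if n_steps > 1 else 0.0
    return times

# Four particles: 0 and 1 merge first, then 2 joins, then 3 joins last.
history = [(0, 1), (4, 2), (5, 3)]
print(relative_merging_times(4, history))
```

In this example particles 0 and 1 are assigned time 0, particle 2 an intermediate time, and particle 3 time 1.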

From the figure, we can see that the parametric order- 
ing of recombinations of each algorithm matches well to 
the Monte Carlo events. The Monte Carlo events have 
the same ordering of merging time for soft and collinear 
particles predicted by power counting, and the particles 
are fairly tightly clustered around their parametric merg- 
ing time. Recall that this analysis is based on the para- 
metric scaling of the metric, and leading order contri- 
butions away from this scaling can give corrections to 
the behavior shown here. While the parametric scaling 
is only the dominant action of the algorithm, it is en- 
couraging that the picture of soft and collinear modes 
captures the behavior of the algorithm well. 



3. The JADE Algorithm

Finally we consider the original recombination algorithm,
JADE [55, 56]. JADE is an exclusive algorithm,
meaning the algorithm operates with a pairwise metric
$y_{ij}$ and a cut parameter $y_{\rm cut}$ (instead of a single particle
metric). When the smallest $y_{ij}$ is greater than $y_{\rm cut}$, the
algorithm stops and all remaining particles are promoted
to jets.

The pairwise metric for JADE is the invariant mass,

$$ y_{ij} = \frac{(p_i + p_j)^2}{Q^2} . \qquad (28) $$

Although JADE is a well-defined, infrared safe jet algorithm,
it was shown by explicit calculation at $O(\alpha_s^2)$
that the leading logarithms in the dijet cross section,
$\alpha_s^n \ln^{2n} y_{\rm cut}$, do not exponentiate [55]. This spoils the
perturbative expansion and prevents accurate theoretical
predictions from being made. This calculation is an explicit
demonstration that JADE does not factorize. Furthermore,
it was shown that JADE does not satisfy a
necessary and sufficient condition for exponentiation to
next-to-leading logarithmic accuracy, recursive infrared
and collinear (rIRC) safety [67]. The failure of factorization
of JADE is easy to see using a power-counting
analysis of OPS factorization, which we now show.

The soft and collinear modes in the JADE metric scale
at leading power as

$$ \begin{aligned}
y_{c_i c_j} &= \frac{(p_{c_i} + p_{c_j})^2}{Q^2} \sim O(\lambda^2) , \\
y_{cs} &= \frac{(p_c + p_s)^2}{Q^2} = \frac{p_c^2 + p_c^- p_s^+}{Q^2} \left[ 1 + O(\lambda^2) \right] \sim O(\lambda^2) , \\
y_{s_i s_j} &= \frac{(p_{s_i} + p_{s_j})^2}{Q^2} \sim O(\lambda^4) .
\end{aligned} \qquad (29) $$

In order for collinear particles in a jet to be recombined
while maintaining collimated jets, $y_{\rm cut}$ should be $O(\lambda^2)$.
The typical action of the algorithm is to combine soft
particles first, followed by soft-collinear and collinear-collinear
recombinations. In running the algorithm we
will compare the soft-collinear metric $y_{cs}$ to $y_{\rm cut}$. Since
$y_{cs}$ depends on the momenta of both soft and collinear
final state particles, this comparison cannot be factorized
into separate phase space constraints on the soft
and collinear sectors. Since the action of the algorithm
depends on the value of the soft-collinear metric $y_{cs}$, the
algorithm cannot factorize.

The standard example, from Brown and Stirling [66],
that demonstrates the problem with the JADE algorithm
is shown in Fig. 5. When analyzed using power counting
in SCET, it illustrates the failure of OPS factorization
discussed above at fixed order in perturbation theory.
This 4-parton configuration has two energetic quarks and
two soft gluons with

$$ y_{s_1 s_2} < \{ y_{c_1 s_1} , y_{c_2 s_2} \} < y_{\rm cut} . \qquad (30) $$

JADE will recombine the soft gluons first, and after this
recombination there exists a region of phase space where

$$ \{ y_{c_1 s_{12}} , y_{c_2 s_{12}} \} > y_{\rm cut} , \qquad (31) $$

making the event a 3-jet event. The comparison in
Eq. (31) is precisely what leads to the failure of factorization
of JADE. We note that this problem already arises
at $O(\alpha_s)$. The failure of factorization in JADE is particularly
severe since it spoils resummation of even the
leading logarithmic series. With only one of the soft gluons
in Fig. 5 (at $O(\alpha_s)$) the event is a 2-jet event, but
with two soft gluons (at $O(\alpha_s^2)$) this region of phase space
can contribute to the 3-jet rate at leading logarithmic accuracy.
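The two-gluon configuration can be checked numerically with a toy version of the exclusive algorithm. The sketch below is illustrative only. The four-momenta, the value of $y_{\rm cut}$, and the function names are assumptions made for the example; $Q$ is simply fixed to 1, and the toy momenta do not exactly conserve momentum.

```python
import math

def y_metric(p, q, Q):
    """JADE metric y_ij = (p_i + p_j)^2 / Q^2 for four-vectors (E, px, py, pz)."""
    e, px, py, pz = (a + b for a, b in zip(p, q))
    return (e * e - px * px - py * py - pz * pz) / (Q * Q)

def jade(particles, y_cut, Q):
    """Exclusive clustering: merge the closest pair until min y_ij > y_cut."""
    objs = [tuple(p) for p in particles]
    while len(objs) > 1:
        y_min, (i, j) = min(
            (y_metric(objs[i], objs[j], Q), (i, j))
            for i in range(len(objs)) for j in range(i + 1, len(objs)))
        if y_min > y_cut:
            break  # all remaining objects are promoted to jets
        merged = tuple(a + b for a, b in zip(objs[i], objs[j]))
        objs = [o for k, o in enumerate(objs) if k not in (i, j)] + [merged]
    return objs

Q, y_cut = 1.0, 0.015
c1 = (0.5, 0.0, 0.0, 0.5)    # energetic quark
c2 = (0.5, 0.0, 0.0, -0.5)   # energetic antiquark
s1 = (0.01, 0.01, 0.0, 0.0)  # soft gluon at wide angle
s2 = (0.01, 0.01 * math.cos(0.2), 0.01 * math.sin(0.2), 0.0)  # nearby soft gluon

print(len(jade([c1, c2, s1], y_cut, Q)))      # one soft gluon
print(len(jade([c1, c2, s1, s2], y_cut, Q)))  # two soft gluons
```

With these numbers each gluon alone has $y_{cs} < y_{\rm cut}$ and is absorbed by a quark, while the pair is recombined first and then fails the comparison with $y_{\rm cut}$, leaving a third jet.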



IV. FACTORIZATION FOR JET 
SUBSTRUCTURE 

We now consider factorization constraints for jet sub- 
structure. The general constraints on merging order 
in Sec. III A apply to substructure algorithms, and the
lessons from the jet algorithm examples will be useful. 
We consider four different substructure algorithms: the 
mass-drop filter algorithm, pruning, trimming, and N- 
subjettiness. The MD-F algorithm uses two generic pro- 
cedures, declustering and filtering, common to many sub- 
structure algorithms. The pruning and trimming al- 
gorithms are generic grooming procedures, designed to 
remove soft radiation far from energetic clusters. N- 
subjettiness is an example of a jet shape useful for sub- 
structure, for which the factorization analysis is simpler. 

FIG. 4: (a) Relative ordering of the pairwise metrics for the anti-$k_T$, C/A, and $k_T$ algorithms at leading power. (b) Relative
merging times vs. particle $p_T$ ratio for anti-$k_T$, C/A, and $k_T$ jets, using jets from the Boost 2010 sample described in the text.
For each algorithm, the jets are found using $R = 1.0$ and have $p_T > 200$ GeV and $m/p_T < 0.25$. The density in each
column of bins for fixed scaled particle $p_T$ is separately normalized. The relative merging times agree well with the qualitative
ordering of the pairwise metrics.

FIG. 5: A simple configuration that shows the failure of soft-collinear
factorization of JADE [66]. The algorithm combines
the soft gluons $s_1$ and $s_2$ first. There is a region of phase
space that contributes to the leading log 3-jet rate where the
soft gluon pair does not get recombined with either collinear
quark.

Traditional substructure algorithms use broad approaches
to improve the discrimination over QCD jets.
Many algorithms place kinematic cuts on the hard splitting
of the boosted object to identify subjets. For example,
the Johns Hopkins top-tagger reduces the QCD
background by placing cuts to identify the top and W.
It requires a hard subjet to have a mass near $m_{\rm top}$, a
daughter subjet to have a mass near $m_W$, and places a cut
on the helicity angle of the subjets of the W [3]. Substructure
algorithms can groom the jet by removing soft,
wide-angle radiation that is characteristic of the underlying
event and pileup. Finally, they also discriminate
QCD and non-QCD jets based on the radiation pattern
in the jet, identifying characteristics of a particular decay
based on properties such as color.

Recently, jet shapes have been shown to be effective jet
substructure tools [7-15]. Jet shapes define an observable
from a projection of the momenta in a jet, making a 
direct measurement of the jet without manipulating the 
jet's constituents. This makes the factorization analysis 
simple for basic shapes, but subtleties remain for more 
complex shapes. 

Factorization constrains the scaling of parameters in 
all of the traditional substructure algorithms, and we find 
that declustering and filtering do not factorize. However, 
simple modifications derived from a power counting anal- 
ysis of the algorithm can be made to allow for factoriza- 
tion. We also find that with the proper scaling for the 
parameters of pruning and trimming, jet shapes (such 
as the jet mass) pass the test for observable factorization.
More exclusive observables such as subjet masses
do not factorize without additional kinematic restrictions
enforcing that the subjets be well separated. These restrictions
allow the subjets to be treated as separate collinear sectors
and relax the factorization constraints.

We find it convenient to first discuss pruning and trim- 
ming, as the power counting analysis is straightforward 
to apply. This analysis also produces a simple picture 
of the phase space remaining after pruning. We compare 
the predictions from power counting to Monte Carlo simulation
and find good agreement. Next, we discuss the
MD-F algorithm. After identifying the problems from 
factorization with declustering and filtering, we explore 
modifications that can allow for factorization. We com- 
pare the modified MD-F algorithm to the original us- 
ing a simple Monte Carlo analysis. Because many sub- 
structure algorithms are based on the declustering and 
filtering steps in MD-F, the issues we discuss for MD- 
F are more widely relevant. Finally, we discuss generic 
jet shapes and their factorization constraints, specializing 
to the case of N-subjettiness. In general, jet shapes are
more theoretically tractable and more amenable to cal- 
culation, and many jet shape calculations for the LHC 
already exist. 



A. Pruning 

The pruning algorithm is intended to improve the mass 
resolution of boosted heavy particles decaying to single 
jets by selectively removing isolated soft radiation in jets 
[27, 28]. Pruning uses the fact that isolated soft radiation
can contribute significantly to poor mass resolution
in jets, and that recombination algorithms naturally
identify this radiation. In pruning, jets are reclustered
and kinematic cuts are placed on each reclustering step;
soft particles at wide angles to energetic ones are removed 
from the jet. This reduces the QCD background to heavy 
particles and allows the substructure of the jet to be used 
to reconstruct the decay of heavy particles. 

Pruning removes particles through a secondary clus- 
tering procedure. Starting with jets found with large R, 
the pruning procedure is: 

1. Recluster the jets with a recombination algorithm. At
each recombination $i, j \to i + j$, test if

$$ \frac{\min(p_{T,i}, p_{T,j})}{p_{T,i+j}} < z_{\rm cut} \quad \text{and} \quad \Delta R_{ij} > D_{\rm cut} . \qquad (32) $$

2. If both of these conditions are met, then discard the
softer of $i, j$ and continue reclustering without merging
the pair. Otherwise, recombine the pair and continue.
The jet that is formed is the new, pruned jet.

The reclustering algorithm is defined without a promotion
metric, so that every particle in the jet ends up being
reclustered or pruned (discarded). The dimensionless parameters
$z_{\rm cut}$ and $D_{\rm cut}$ are inputs to the algorithm. In
the pruning definition, $D_{\rm cut} = m_J/p_{TJ}$ was suggested,
as was $z_{\rm cut} \sim 0.10$-$0.15$.
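The two pruning steps can be sketched as follows. This is a minimal illustration, not the FastPrune or FastJet implementation. It assumes four-vectors stored as `(E, px, py, pz)` and a reclustering metric $\min(p_{Ti}^{\alpha}, p_{Tj}^{\alpha}) \Delta R_{ij}$ with no promotion metric; the names `prune` and `from_pt_y_phi` are invented for the example.

```python
import math

def pt(p):
    return math.hypot(p[1], p[2])

def rapidity(p):
    return 0.5 * math.log((p[0] + p[3]) / (p[0] - p[3]))

def delta_r(a, b):
    dphi = abs(math.atan2(a[2], a[1]) - math.atan2(b[2], b[1]))
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(rapidity(a) - rapidity(b), dphi)

def from_pt_y_phi(pt_, y, phi_):
    """Massless four-vector from (pT, y, phi); a helper assumed for this sketch."""
    return (pt_ * math.cosh(y), pt_ * math.cos(phi_),
            pt_ * math.sin(phi_), pt_ * math.sinh(y))

def prune(constituents, z_cut, d_cut, alpha=0):
    """Recluster the constituents of one jet, vetoing soft wide-angle mergings."""
    objs = list(constituents)
    while len(objs) > 1:
        # Closest pair under the reclustering metric (no promotion metric).
        _, i, j = min(
            (min(pt(objs[i]) ** alpha, pt(objs[j]) ** alpha)
             * delta_r(objs[i], objs[j]), i, j)
            for i in range(len(objs)) for j in range(i + 1, len(objs)))
        a, b = objs[i], objs[j]
        merged = tuple(u + v for u, v in zip(a, b))
        z = min(pt(a), pt(b)) / pt(merged)
        rest = [o for k, o in enumerate(objs) if k not in (i, j)]
        if z < z_cut and delta_r(a, b) > d_cut:
            # Both conditions met: discard the softer branch, keep the harder.
            rest.append(a if pt(a) >= pt(b) else b)
        else:
            rest.append(merged)
        objs = rest
    return objs[0]

core = [from_pt_y_phi(60.0, 0.0, 0.0), from_pt_y_phi(40.0, 0.1, 0.0)]
wide = from_pt_y_phi(1.0, 0.8, 0.0)    # soft, far from the energetic core
near = from_pt_y_phi(1.0, 0.25, 0.0)   # soft, close to the energetic core
print(pt(prune(core + [wide], z_cut=0.1, d_cut=0.3)))
print(pt(prune(core + [near], z_cut=0.1, d_cut=0.3)))
```

In the first call the soft particle fails both tests and is discarded; in the second it lies within $D_{\rm cut}$ of the energetic core and is kept. The default `alpha=0` corresponds to C/A reclustering.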

Pruning is slightly different from other substructure 
methods since it reclusters the jet and applies kinematic 
cuts at each stage of reclustering, instead of first find- 
ing subjets and then implementing kinematic cuts. This 
makes the factorization considerations more straightfor- 
ward, since we can consider the effects of the pruning 
cuts at each clustering step. 

We first derive a general constraint on substructure 
algorithms: 



• A collinear subjet cannot be removed from the jet 
unless the entire collinear sector (jet or subjet) is re- 
moved. 

A generic collinear subjet, represented by a branch in the
collinear tree in Fig. 2, has been merged with soft particles
by the algorithm. As discussed in Sec. III A, we
cannot know which soft particles have been merged into
that collinear subjet: soft particles cannot resolve individual
collinear particles. Therefore, when we remove the
subjet, the effect on the soft function is ambiguous. Since
the phase space constraint that removes the collinear sub- 
jet is implemented in the jet function, OPS factorization 
requires that the soft sector cannot know about this cut. 
Therefore, removing collinear subjets will break factor- 
ization. 

The only way to evade this constraint is if an entire 
collinear sector is removed. If an entire collinear sector 
is removed, then all soft particles merged into the sector 
will be removed. This does not mean that the entire jet 
has to be discarded, however. If we implement phase 
space cuts so that a collinear subjet is well separated 
from all other collinear particles in the jet, then we can 
treat that subjet as its own collinear sector. By well-separated,
we mean that the subjets are collimated with
respect to their separation, e.g., $m_{j_1 j_2} \gg \{ m_{j_1}, m_{j_2} \}$. In
general, though, it is not physically sensible to remove 
an entire collinear sector. Collinear radiation typically 
comes from energetic particles in a jet's evolution, and is 
a part of any useful substructure observable. 

The requirement that collinear particles not be removed
from the jet constrains the scaling of the parameters
$z_{\rm cut}$ and $D_{\rm cut}$, since $c, c \to c$ recombinations cannot
be pruned. In Fig. 6 we show the scaling of the kinematic
variables $z$ and $\Delta R$ for the three kinds of recombinations.
We include the contribution from all regions
of phase space allowed in the effective theory that could
contribute at leading power, including those with non-characteristic
scaling (e.g., $\Delta R_{cc} \sim O(\lambda^2)$) as discussed
in Sec. II B.

From the figure, we can see that no collinear subjets
will be removed if $z_{\rm cut}$ is chosen to scale as

$$ z_{\rm cut} \lesssim O(\lambda) . \qquad (33) $$

No scaling constraint needs to be made on $D_{\rm cut}$, since
this is a cut on the relative angle between particles that
depends entirely on either collinear (for $c, c \to c$ merging)
or soft (for $c, s \to c$ or $s, s \to s$ merging) momenta.
With the scaling choice in Eq. (33), pruning satisfies the
necessary condition for OPS factorization and could be
factorized. We note that the constraint on the scaling of
$z_{\rm cut}$ is consistent with the intention of pruning to remove
soft, wide angle radiation from the jet substructure. The
original choices for the parameters, $z_{\rm cut} = 0.10$-$0.15$
and $D_{\rm cut} = m_J/p_{TJ}$, are consistent with $z_{\rm cut}, D_{\rm cut} \sim \lambda$,
which is depicted in the right side of Fig. 6.
FIG. 6: The scaling of recombination variables $z$ and $\Delta R$ in the pruning algorithm. Left: table of scaling for collinear-collinear,
collinear-soft, and soft-soft recombinations. Right: Plot of $\Delta R$ vs. $z$, and the regions that each type of recombination can
occupy and still give a leading power contribution to the cross section. The shaded region is the region where pruning takes
place, with the choice $z_{\rm cut}, D_{\rm cut} \sim O(\lambda)$.



1. Characteristic Behavior of Pruning 

Just as we did for jet algorithms, we can apply the
ideas of power counting to study the behavior of pruning.
Using power counting, we can develop a picture of what
remains in a jet after pruning for different reclustering
algorithms (here we consider $k_T$, C/A, and anti-$k_T$). Focusing
on energetic jets with a single collinear sector, we
find that this simple picture describes the jets remarkably
well, as the qualitative picture agrees with the behavior
of pruned jets in a Monte Carlo simulation.

We choose $z_{\rm cut}$ and $D_{\rm cut}$ to scale as $\lambda$, as in the standard
pruning implementation. We will make use of Fig. 6,
which plots the regions where each type of recombination
can contribute to the cross section at leading power. Two
facts help us develop a picture of pruning that we can apply
to different algorithms:

• Factorization requires that no collinear subjet be
pruned.

• Every soft subjet will eventually be recombined with
a collinear subjet, where it will be pruned unless
$\Delta R_{ns} < D_{\rm cut}$.

The first item means that the full collinear sector will remain
after pruning. The second means that we can determine
which soft particles remain after pruning by determining
how unpruned soft-soft recombinations shape the
soft phase space. The relative ordering of soft-collinear
and soft-soft recombinations in a reclustering algorithm
will determine the soft phase space after pruning. We
refer to Fig. 4(a) and the discussion in Sec. III B for the
ordering of recombinations for the anti-$k_T$, C/A, and $k_T$
algorithms.

To determine the soft region that remains after pruning,
we will consider the configuration shown in Fig. 3
with the jet size $R$ replaced by $D_{\rm cut}$. The pair of soft
particles, $s_1$ and $s_2$, have

$$ \Delta R_{ns_1} < D_{\rm cut} < \Delta R_{ns_2} . \qquad (34) $$

If $s_1$ is merged with the jet before $s_1$ and $s_2$ are combined,
then $s_2$ will be pruned. Therefore we can determine when

$$ \rho_{s_1 s_2} < \rho_{c s_1} , \qquad (35) $$

which will tell us the parametric size of the region of soft
phase space that remains after pruning.

The anti-$k_T$ algorithm characteristically merges soft-collinear
pairs before soft-soft pairs. The comparison in
Eq. (35) for anti-$k_T$ is

$$ \Delta R_{s_1 s_2} < \frac{\max(p_{Ts_1}, p_{Ts_2})}{p_{Tc}} \, \Delta R_{ns_1} \ll D_{\rm cut} . \qquad (36) $$

The ratio of soft and collinear $p_T$ requires $\Delta R_{s_1 s_2} \ll
\Delta R_{ns_1}$, and so the region of phase space where $s_2$ is not
pruned is power suppressed. This means that for anti-$k_T$,
the soft phase space after pruning is simply a disk of
radius $D_{\rm cut}$ centered on the jet axis up to power corrections.

The C/A algorithm characteristically merges soft-collinear
and soft-soft pairs simultaneously. The comparison
in Eq. (35) for C/A is

$$ \Delta R_{s_1 s_2} < \Delta R_{ns_1} < D_{\rm cut} . \qquad (37) $$

This implies that soft particles within an angle $2 D_{\rm cut}$
of the jet axis can remain after pruning. Note that as
the angle of $s_2$ to the jet axis grows, the recombined
pair of soft particles is more likely to be farther than
$D_{\rm cut}$ from the jet axis and be pruned. Multiple soft-soft
recombinations will mitigate this effect, and will tend to
allow for wider angle soft particles to be merged into the
jet. However, the essential feature is that we expect the
soft phase space to extend out to a radius $2 D_{\rm cut}$, twice
that for anti-$k_T$. In Fig. 7 we show the basic steps of
pruning for C/A.

The $k_T$ algorithm characteristically merges soft-collinear
and soft-soft pairs simultaneously, as in the C/A
algorithm. The comparison in Eq. (35) for $k_T$ is

$$ \Delta R_{s_1 s_2} < \frac{p_{Ts_1}}{\min(p_{Ts_1}, p_{Ts_2})} \, \Delta R_{ns_1} . \qquad (38) $$

If $p_{Ts_2} > p_{Ts_1}$, then this constraint is the same as C/A.
But if $p_{Ts_2} < p_{Ts_1}$, then $s_2$ can be at wider angle to
the jet axis and still be merged with $s_1$. Additionally,
since $s_2$ is softer in this case, it is more likely that the
recombined pair of soft particles will be closer than $D_{\rm cut}$
from the jet axis. This suggests that the soft phase space
remaining after pruning will be a disk of radius $2 D_{\rm cut}$ (as
for C/A) plus a region that extends out to larger angles
where a fraction of the soft particles are retained.

We can test this basic picture predicted by power 
counting by looking at the effect of pruning on jets from 
the Boost 2010 samples. Using the same events as in 
Fig. [2j we find jets with the anti-kx algorithm using 
R = 1.0. Selecting jets with m/px < 0.25, we prune 
them using anti-kr, C/A, and kx reclustering using the 
FastPrune vO.4.3 plugin for FastJet 5 , choosing D cut = 0.1 
and z cut = 0.1 [68 . 

To quantify what remains after pruning, we bin all particles
in their relative $\Delta y$ and $\Delta\phi$ to the jet axis. For each
bin, we sum over all jets and determine the fraction of
$p_T$ that remains after pruning, which maps the surviving
$p_T$ as a function of the location in the jet. For each
algorithm, this is shown in Fig. 8. We also
show the region predicted by power counting where the
soft phase space remains largely unpruned and most of
the energy remains. The Monte Carlo simulation agrees
well with the basic predictions of power counting.
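The binning procedure described above can be sketched in a few lines of Python. This is a minimal illustration and not the analysis code used for the figure: the input layout (a list of jets, each a list of `(dy, dphi, pt, survives)` tuples) and the function name are assumptions, and the 0.05 bin size is taken from the figure caption.

```python
import math
from collections import defaultdict


def pt_fraction_map(jets, bin_size=0.05):
    """Fraction of pT surviving pruning in each (dy, dphi) bin.

    jets: iterable of jets; each jet is a list of tuples
          (dy, dphi, pt, survives), with dy and dphi measured
          relative to the jet axis (hypothetical input layout).
    Returns a dict mapping integer bin indices to kept_pT / total_pT,
    summed over all jets.
    """
    total = defaultdict(float)
    kept = defaultdict(float)
    for jet in jets:
        for dy, dphi, pt, survives in jet:
            key = (math.floor(dy / bin_size), math.floor(dphi / bin_size))
            total[key] += pt
            if survives:
                kept[key] += pt
    return {key: kept[key] / total[key] for key in total if total[key] > 0}
```

Bins near the jet axis should then show fractions close to one, with the fraction falling off outside the region predicted by power counting.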



B. Trimming 

The trimming algorithm is intended to improve the resolution
of jets coming from light partons by removing
contamination from initial state radiation, the underlying
event, and pileup [29]. Trimming uses the fact that
the energetic components of final state radiation in jets
are well collimated and can be clustered into subjets by
a jet algorithm with radius $R_{sub} \ll R$. Fat jets used
to capture nearly all of the radiation from the evolution
of an energetic particle can then be trimmed, and the
mass resolution of reconstructed decays is subsequently
improved.

Trimming defines subjets through a secondary jet algorithm.
Starting with jets found with large $R$, the trimming
procedure is:

1. Recluster the jet with an algorithm with radius
$R_{sub} \ll R$. The jets found by this algorithm are
candidate subjets.

2. Remove any candidate subjets with $p_T < p_T^{cut} =
f_{cut} \Lambda_{hard}$, where $f_{cut}$ and $\Lambda_{hard}$ are parameters. The
remaining subjets form the new, trimmed jet.

In the original trimming definition, $\Lambda_{hard} \sim p_{TJ}$ or $\sqrt{s}$,
and $f_{cut} \ll 1$ is a dimensionless parameter.
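The second step of the procedure is simple enough to write out directly. The sketch below assumes the reclustering into candidate subjets has already been done (for example by a jet-finding library) and that each subjet is a dictionary with a `"pt"` entry; both the data layout and the numerical values in the example are illustrative choices, not values from the text.

```python
def trim(subjets, f_cut, lambda_hard):
    """Keep the candidate subjets that pass the trimming cut.

    subjets:     list of dicts with a "pt" key (hypothetical layout),
                 already reclustered with radius R_sub.
    f_cut:       dimensionless cut parameter, f_cut << 1.
    lambda_hard: hard scale, e.g. the pT of the untrimmed jet.
    Subjets with pt < f_cut * lambda_hard are removed.
    """
    pt_cut = f_cut * lambda_hard
    return [s for s in subjets if s["pt"] >= pt_cut]


# Illustrative numbers: a 200 GeV jet with four candidate subjets.
example_subjets = [{"pt": 180.0}, {"pt": 15.0}, {"pt": 4.0}, {"pt": 1.0}]
example_trimmed = trim(example_subjets, f_cut=0.03, lambda_hard=200.0)
```

With these inputs the cut sits at 6 GeV, so the two hardest subjets form the trimmed jet.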

As discussed for pruning, factorization requires that a
collinear subjet cannot be removed unless phase space
constraints require that the subjet is well separated and
can be treated as its own collinear sector. This requires
that $p_T^{cut}$ must scale with $\lambda$ such that the cut does not
remove collinear subjets:

$$\frac{p_T^{cut}}{p_{TJ}} \lesssim \lambda . \tag{39}$$



5 Pruning and other substructure algorithms are natively implemented
in FastJet 3.0.0, released as this work was being completed.



Any larger scaling would allow a region of phase space 
where collinear subjets are removed. 

No analogous constraint exists for the subjet radius
$R_{sub}$. However, an important principle applies in choosing
the value of $R_{sub}$. The restriction operator $\mathcal{R}$ implements
the phase space cuts of the measurement. These
cuts determine the kinematics of the jets that count
towards the cross section. The cuts imposed by substructure
methods tend to select kinematics that resemble
a heavy particle decay and look unlike QCD. For instance,
many substructure algorithms require well separated
clusters of energy in the jet that look like subjets
from a decay, strongly cutting on the QCD background.
The choice of $R_{sub}$ will affect the resulting substructure,
especially if additional constraints are placed
on the collinear subjets.

When considering what types of observables have OPS 
factorization, we find a general constraint on jet substruc- 
ture: 

• We cannot measure observables for individual subjets 
unless the subjets are well-separated in the jet. 

Because soft particles do not resolve individual collinear
particles, the soft constituents of an individual subjet are
ambiguous. This implies that we cannot measure certain
properties of individual subjets and maintain factorization.
For example, the subjet mass depends on knowing
the soft and collinear constituents of the subjet. Although
we know which soft and collinear modes are in the
jet after trimming, we do not know the distribution of the
soft modes inside the collinear subjets. This implies that
we can factorize jet shape observables which act on the
whole (trimmed) jet, but not subjet observables that require
knowing the soft constituents of particular subjets.
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FIG. 7: The stages of pruning for C/A reclustering. Groups of soft particles are red and groups of collinear particles are blue. 
First, all pairs of particles at small angles are merged (a $\to$ b). Then larger angle recombinations occur; collinear particles are
merged together, and some pruning takes place (b $\to$ c). Finally wide angle pairs of particles are merged, and any soft particles
farther than $\approx 2D_{cut}$ from the jet axis get pruned (c $\to$ d).



[Figure 8 panels: anti-$k_T$ pruning, C/A pruning, $k_T$ pruning; horizontal axis $\Delta y$.]

FIG. 8: The fraction of $p_T$ remaining after pruning as a function of the location in the jet, for an ensemble of jets from
the Boost 2010 events described in the text. The jets are found with anti-$k_T$ using $R = 1.0$ and have $p_T > 200$ GeV and
$m/p_T < 0.25$. Pruning was performed with anti-$k_T$, C/A, and $k_T$ reclustering, with $z_{cut} = 0.10$ and $D_{cut} = 0.1$. In each bin
of size 0.05-by-0.05 in $\Delta y$ and $\Delta\phi$, we plot the fraction of $p_T$ that remains after pruning. The green circle shows the region
predicted by power counting where most of the energy remains. This circle has radius $D_{cut}$ for anti-$k_T$ and $2D_{cut}$ for C/A and
$k_T$.



This constraint applies to other substructure algorithms 
and jet algorithms in general. 

If we are interested in measuring more exclusive jet 
observables, we can impose additional constraints on the 
subjet clustering procedure. If we require the collinear 
subjets to be well-separated, then we can treat each sub- 
jet as a distinct collinear sector. This means separate 
collinear subjets would be equivalent to separate jets 
from the perspective of soft modes. The soft function
can know about the direction and $p_T$ of each subjet, and
soft modes can be assigned to specific collinear subjets.

It is fairly straightforward to require that each subjet
be well-separated. For instance, we can require that the
collinear subjets be separated by an angle $R_{sep} \gg R_{sub}$.
While this will make trimming more complex, such constraints
are necessary for factorization of jet observables
which measure properties of individual subjets sensitive
to soft momenta.
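A separation requirement of this kind could be checked pairwise on the subjets. The sketch below is one possible reading, under the assumption that subjets are dictionaries with rapidity `"y"` and azimuth `"phi"` entries and that a single threshold `r_sep` stands in for the scaling statement $R_{sep} \gg R_{sub}$; the names are hypothetical.

```python
import math
from itertools import combinations


def delta_r(a, b):
    """Distance in the (y, phi) plane, with phi wrapped to [0, pi]."""
    dphi = abs(a["phi"] - b["phi"]) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(a["y"] - b["y"], dphi)


def well_separated(subjets, r_sep):
    """True if every pair of subjets is farther apart than r_sep."""
    return all(delta_r(a, b) > r_sep for a, b in combinations(subjets, 2))
```

In practice `r_sep` would be chosen several times larger than the subjet radius, so that each subjet can be treated as its own collinear sector.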



C. The Mass-Drop Filter Method 

The mass-drop filter (MD-F) method is intended to tag
boosted Higgs decays into a fat jet [12]. The algorithm
works by declustering a found jet, stepping backwards 
through the recombinations, until a declustering charac- 
teristic of a boosted decay is identified. These objects 
are further declustered into subjets, and these subjets 
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are filtered by removing the softer subjets. Decluster- 
ing and filtering remove contamination from underlying 
event and pileup, which has a characteristically lower 
energy scale than the hard radiation in the jet. The 
initial declustering step significantly reduces the QCD 
background, while the filtering step improves the mass 
resolution of jets containing a boosted Higgs. 

The MD-F method starts with jets defined by the C/A
algorithm. MD-F defines two subjets through a declustering
procedure which steps back through the recombinations
of the jet algorithm:

1. Label the jet $J$ and perform a single declustering step
to obtain two candidate subjets $j_1$ and $j_2$ with $m_{j_1} > m_{j_2}$.

2. On the candidate subjets, test if

$$a = \frac{m_{j_1}}{m_J} < \mu \quad \text{and} \quad y = \frac{\min(p_{T j_1}^2, p_{T j_2}^2)}{m_J^2}\, \Delta R_{j_1 j_2}^2 > y_{cut} . \tag{40}$$

3. If these cuts are passed, the splitting $J \to j_1, j_2$ is
a hard declustering, and $j_1$, $j_2$ are the hard subjets;
proceed to the next step. If these cuts are not passed,
then redefine the jet to be the heavier subjet $j_1$ and
return to step 1. Repeat until hard subjets are found.

The parameters $\mu$ and $y_{cut}$ are inputs to the algorithm.
These subjets are then filtered:

4. Define $R_{filt} = \min(0.3, \frac{1}{2} \Delta R_{j_1 j_2})$. Decluster $j_1$ and $j_2$
down to $R_{filt}$ and keep the 3 highest $p_T$ subjets from
the declustering. These 3 subjets define the new jet.

Filtering steps back through the recombinations of the
C/A algorithm until the declustering angle is less than
$R_{filt}$, keeping only the highest $p_T$ subjets from the
declustering.
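The cuts applied at each declustering step can be sketched as a small function. This is an illustration under stated assumptions, not the tagger's reference implementation: it takes the standard mass-drop definitions $a = m_{j_1}/m_J$ and $y = \min(p_{Tj_1}^2, p_{Tj_2}^2)\,\Delta R_{j_1 j_2}^2/m_J^2$, it operates on kinematic quantities that a clustering history would supply, and the function names and example numbers are hypothetical. The default parameter values are the original choices quoted in the text.

```python
def mass_drop_test(m_j1, m_jet, pt_j1, pt_j2, dr_j1j2, mu=0.67, y_cut=0.09):
    """Test one declustering J -> j1, j2 against the mass-drop cuts.

    m_j1:    mass of the heavier subjet j1.
    m_jet:   mass of the parent jet J.
    pt_j1, pt_j2: transverse momenta of the two subjets.
    dr_j1j2: angular separation of the two subjets.
    Returns True if a < mu and y > y_cut (a hard declustering).
    """
    a = m_j1 / m_jet
    y = min(pt_j1, pt_j2) ** 2 / m_jet ** 2 * dr_j1j2 ** 2
    return a < mu and y > y_cut


def filter_radius(dr_j1j2):
    """Filtering radius, assumed to be min(0.3, dR(j1, j2) / 2)."""
    return min(0.3, 0.5 * dr_j1j2)
```

A symmetric splitting such as `mass_drop_test(30.0, 115.0, 150.0, 100.0, 0.8)` passes, while the same splitting with a 5 GeV second subjet fails the cut on $y$.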

The characteristic scalings of the variables $a$ and $y$ used
in the declustering step are given in Table II. MD-F is
applied to energetic jets and, since declustering discards
the lighter (softer) of the two subjets, the parent jet in
declustering will always be collinear.

The original choices for the parameters, $\mu = 0.67$ and
$y_{cut} = 0.09$, are consistent with $\mu \sim \lambda^0$ and $y_{cut} \sim \lambda$ [12].

The MD-F algorithm introduces many complications 
for factorization. We first analyze the declustering pro- 
cedure that determines the hard subjets, followed by the 
filtering procedure. 

1. Declustering 

The basic process of declustering requires knowledge
of the ordering of recombinations in the jet. However,
as discussed in Sec. III A, the relative ordering of recombinations
in the soft and collinear sectors is not available
in a factorizable cross section. Therefore we cannot
know which declustering step comes first: $c \to c, c$ or
$c \to c, s$. Unless additional kinematic constraints are
placed on the substructure, MD-F and other algorithms
that use declustering do not factorize.

Using insight from the previous examples, we discuss
how a declustering method could be factorized. If there
are two well-separated collinear sectors in a single jet,
then the first $c \to c, c$ declustering will resolve these
collinear sectors. In this case the two collinear subjets
from the declustering can be resolved by the soft sector,
as they are essentially separate jets. This means that the
kinematics of the $c \to c, c$ splitting can be known in the
soft sector, and the constraints from merging order no
longer apply.

The idea that the first $c \to c, c$ declustering defines the
hard subjets is physically sensible, since the kinematics
resemble a hard splitting. Furthermore, in the initial
$c \to c, s$ splittings the soft particles will be at wide angles
to the jet, which are the recombinations that most substructure
methods aim to remove. After discussing the
filtering aspect of MD-F, we perform a power counting
analysis on the declustering step of MD-F and suggest
ways to require the jet to have multiple well-separated
collinear sectors, using the kinematic cuts in MD-F.



2. Filtering 

The filtering procedure also introduces complications 
for factorization. The basic process of filtering is simple: 

• Decluster to a given scale and keep only the N hardest 
subjets. 

Without any additional kinematic constraints, this violates
factorization. The collection of $N$ subjets that is
kept after filtering will have $N_{coll}$ collinear subjets and
$N - N_{coll}$ soft subjets. If $N_{coll}^{tot}$ is the total number of
collinear subjets before the filtering step is applied, then

$$N_{coll} = \min(N, N_{coll}^{tot}) . \tag{41}$$

Unless $N_{coll}$ is fixed by a kinematic constraint, the number
of soft subjets that are removed by filtering depends
on $N_{coll}$, which is a phase space constraint coming from
the collinear sector. Since the soft function cannot know
about this constraint, factorization is broken.

The similarity of trimming to the filtering step suggests
a simple alternative. Instead of a cut keeping the $N$ hardest
subjets, if only subjets with $p_T > p_T^{cut}$ are kept,
as in trimming, then if $p_T^{cut}/p_{TJ} \sim \mathcal{O}(\lambda)$ only the collinear
subjets will be kept and factorization can be preserved.
As with trimming, only jet shape observables that sum
over all particles in the jet can be factorized. Observables
such as subjet masses, where it is required to know
which collinear subjet a soft particle is in, do not factorize
without additional kinematic constraints on the subjets.
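The difference between the two filtering prescriptions can be seen in a toy example. The sketch below is purely illustrative: subjets are dictionaries carrying a `"pt"` value and a `"mode"` label (`"coll"` or `"soft"`), a labelling that exists only in this toy setup and is not available in a real measurement, and all numbers are invented for the demonstration.

```python
def keep_n_hardest(subjets, n):
    """Filtering as usually defined: keep the n highest-pT subjets."""
    return sorted(subjets, key=lambda s: s["pt"], reverse=True)[:n]


def keep_above_cut(subjets, pt_cut):
    """Trimming-like alternative: keep subjets with pT above a fixed cut."""
    return [s for s in subjets if s["pt"] > pt_cut]


def count_soft(subjets):
    """Number of subjets labelled as soft in the toy setup."""
    return sum(1 for s in subjets if s["mode"] == "soft")


# The same soft radiation accompanies two different collinear configurations.
soft = [{"pt": 5.0, "mode": "soft"}, {"pt": 3.0, "mode": "soft"},
        {"pt": 2.0, "mode": "soft"}]
two_coll = [{"pt": 120.0, "mode": "coll"}, {"pt": 80.0, "mode": "coll"}] + soft
three_coll = [{"pt": 90.0, "mode": "coll"}, {"pt": 70.0, "mode": "coll"},
              {"pt": 40.0, "mode": "coll"}] + soft
```

Keeping the three hardest subjets retains one soft subjet in the first configuration and none in the second, so the soft phase space depends on the collinear sector. A fixed cut of 20 GeV removes all soft subjets in both cases.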






TABLE II: Scaling for the variables $a$ and $y$ in the MD-F algorithm.

declustering            $a$                                 $y$
$c_J \to c_1, c_2$      $m_{c_1}/m_{c_J} \sim \lambda^0$    $\sim \lambda^0$
$c_J \to c_1, s_2$      $m_{c_1}/m_{c_J} \sim \lambda^0$    $(p_{T s_2}^2/m_{c_J}^2)\, \Delta R_{n,s_2}^2 \sim \lambda^2$



3. Making MD-F Factorizable 

A power counting analysis of the declustering step of
MD-F can reveal simple ways to impose the kinematic
constraint that the jet contain multiple well-separated
collinear sectors. The kinematic cuts of the MD-F algorithm
make it natural for the first $c \to c, c$ declustering
to define the hard subjets. We only need to impose
the constraint that the declustered collinear subjets are
well-separated, i.e., that they are collimated compared to
their separation, so that the collinear modes in each subjet
do not overlap. Furthermore, we must require that
all $c \to c, s$ splittings fail the hard subjet cuts.

Consider the power counting of collinear-collinear and
collinear-soft splittings in terms of $a$ and $y$, given in Table II.
These are valid when there is one collinear sector
in the jet. If we choose

$$y_{cut} \sim \lambda , \tag{42}$$

then $c \to c, s$ splittings will always fail this cut, and
$c \to c, c$ splittings will pass this cut. Since $c \to c, c$ splittings
have $a \sim \lambda^0$, unless $\mu > a_{max}$ there is a region of phase
space where $c \to c, c$ splittings will fail the cut on $a$.
However, we can use this cut to require that the jet have
two well-separated collinear sectors. If $\mu \sim \lambda$, then any
splitting $J \to j_1, j_2$ with $a < \mu$ will have

$$m_{j_2} < m_{j_1} \ll m_J . \tag{43}$$

This implies that the subjets are collimated compared to
their separation. A $c \to c, c$ declustering will only pass
this cut if the two collinear subjets are well-separated,
meaning we can treat them as separate collinear sectors.
Therefore, if we require

$$\mu \sim \lambda , \qquad y_{cut} \sim \lambda , \tag{44}$$

then the declustering step of MD-F passes the power
counting test for soft-collinear phase space factorization.
With this choice, a $c \to c, c$ declustering where the two
collinear subjets are separate collinear sectors will pass
the cuts. Only jets with two (or more) collinear sectors
will pass the cuts. While $\mu \sim \lambda$ is more restrictive than
$\mu \sim \lambda^0$, it is required for OPS factorization. The hard
subjets can subsequently be filtered using the factorizable
procedure discussed above.

Reducing the value of $\mu$ places more restrictive cuts
on the substructure, which will lower the Higgs tagging
efficiency. This is a general property of more exclusive
factorization theorems: we must be more exclusive in the
jet substructure to ensure that factorization constraints
can be satisfied. However, a lower signal efficiency will be
accompanied by a lower background mistag rate, meaning
the substructure method can still be effective. For the
MD-F tagger, we perform a simple Monte Carlo study to
determine the impact of choosing a smaller $\mu$. We generate
$Zj$ and $ZH$ events with $m_H = 115$ GeV using Pythia
v8.145 [52], and select jets with $p_T$ between 200 and 300
GeV, decaying the $Z$ leptonically. We carry out the MD-F
algorithm with the default value $y_{cut} = 0.09$ and two
$\mu$ values, 0.67 (the default value) and 0.25 (the small
value).

In Fig. 9 we plot the mass distributions for the Higgs
signal and the background, for both values of $\mu$. We can
see that while the overall tagging efficiency is reduced,
the QCD background significantly decreases. Choosing a
mass range for the Higgs candidate of [100, 130] GeV, we
find that with a small $\mu$ the tagging efficiency is reduced
to 48% of the efficiency with the default parameters, while
the signal purity (signal over background, $S/B$) increases
by a factor of 1.8 and the significance (signal over noise,
$S/\sqrt{B}$) changes by a factor of 0.94. We note that choosing
a more restrictive value of $\mu$ will further reduce the
tagging efficiency. The significance is nearly unchanged
while the lower efficiency is accompanied by a higher signal
purity.
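The quoted factors are related by simple arithmetic, which the following sketch makes explicit. It assumes only the definitions of purity ($S/B$) and significance ($S/\sqrt{B}$) and uses the rounded numbers quoted in the text as inputs; the function name is hypothetical.

```python
import math


def background_and_significance(eff_ratio, purity_ratio):
    """Infer the change in background and in S/sqrt(B).

    eff_ratio:    S_new / S_old (signal efficiency ratio).
    purity_ratio: (S/B)_new / (S/B)_old.
    Returns (B_new / B_old, significance_new / significance_old).
    """
    bkg_ratio = eff_ratio / purity_ratio
    significance_ratio = eff_ratio / math.sqrt(bkg_ratio)
    return bkg_ratio, significance_ratio


bkg_ratio, significance_ratio = background_and_significance(0.48, 1.8)
```

With the rounded inputs, the background falls to roughly 27% of its default value and the significance factor comes out near 0.93, in line with the quoted 0.94 given the rounding of the inputs.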



D. N-subjettiness 

$N$-subjettiness is a jet shape constructed analogously
to $N$-jettiness, an event shape [?7]. $N$-jettiness provides
a veto on additional jets, effectively characterizing how
$\le N$-jet-like an event is ($N$-jettiness does not veto against
fewer jets: if $N$-jettiness is small, so is $M$-jettiness for
$M > N$). $N$-subjettiness is a natural generalization
of this. It provides a veto against additional subjets
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FIG. 9: The Higgs mass distribution (left) and background jet mass distribution (right) for the MD-F algorithm with the 
default $\mu$ (0.67) and small $\mu$ (0.25). The small $\mu$ allows for factorization. With small $\mu$ the signal efficiency is lower, but there
is a compensating drop in the mistag rate.



through a measure [3 19], defined as

$$\tau_N^{(\beta)} = \min_{\{n_J\}} \frac{1}{d_0} \sum_i p_{T i} \min\left\{ (\Delta R_{1,i})^\beta, \ldots, (\Delta R_{N,i})^\beta \right\} , \tag{45}$$

where the measure $\Delta R_{J,i}$ is the separation between a
particle $i$ and subjet $J$. The inner minimization picks
the smallest $\Delta R_{J,i}$ over all $N$ subjets $J$, while the outer
minimization chooses the subjet axes $n_J$ that minimize
the sum. The normalization factor $d_0$ is

$$d_0 = \sum_i p_{T i} R^\beta , \tag{46}$$

with $R$ the jet radius. The power $\beta$ is required to be
non-negative by collinear safety.
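For a fixed set of candidate axes, the sum inside the definition can be evaluated directly. The sketch below does only that: it does not perform the outer minimization over axes, it assumes the normalization $d_0 = \sum_i p_{Ti} R^\beta$, and it represents particles and axes as dictionaries with `"pt"`, `"y"` and `"phi"` entries. All names and the data layout are hypothetical.

```python
import math


def delta_r(a, b):
    """Distance in the (y, phi) plane, with phi wrapped to [0, pi]."""
    dphi = abs(a["phi"] - b["phi"]) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(a["y"] - b["y"], dphi)


def tau_n(particles, axes, jet_radius, beta):
    """N-subjettiness for fixed axes (no minimization over the axes).

    particles:  list of dicts with "pt", "y", "phi".
    axes:       list of N dicts with "y", "phi" (candidate subjet axes).
    jet_radius: jet radius R entering the normalization d0.
    beta:       angular exponent.
    """
    d0 = sum(p["pt"] for p in particles) * jet_radius ** beta
    weighted = sum(
        p["pt"] * min(delta_r(p, axis) ** beta for axis in axes)
        for p in particles
    )
    return weighted / d0
```

For a toy two-prong jet with one particle on each prong, a single axis midway between the prongs gives a sizeable $\tau_1$, while two axes aligned with the prongs give $\tau_2 = 0$.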

When $N$-subjettiness is small, the jet is $\le N$-subjet-like.
A large value of $\tau_N$ means there are more subjets in
the jet, which provides the veto against additional subjets.
Because of this, the ratio $\tau_N/\tau_{N-1}$ is a useful
discriminant to identify boosted $N$-body decays. This ratio
will be small for the $N$-subjet-like boosted decay, since $\tau_N$
will be small and $\tau_{N-1}$ will be large. For QCD, however,
we expect this ratio to be $O(1)$ except for 3 (or more)
subjet-like jets, which are rarer in QCD. When compared
to other techniques using the Boost 2010 Monte
Carlo samples, this single variable provides a more effective
top tagger over nearly the entire range of top tagging
efficiency [9]. We now discuss factorization considerations
for jet shapes in general, and some of the subtleties
that arise with $N$-subjettiness.

Most shapes (like jet mass or $N$-subjettiness) are linear
in the final state momenta, taking the form

$$\tau = \sum_{i \in J} f_\tau(k_i) , \tag{47}$$

where collinear safety requires $f_\tau(k) \propto (E_k)^1$. For these
linear jet shapes, the observable splits into separate sums
of collinear and soft contributions,

$$\tau = \tau_c + \tau_s = \sum_{c_i \in J} f_\tau(k_i) + \sum_{s_i \in J} f_\tau(k_i) . \tag{48}$$

For simple functions $f_\tau$, this makes OPS factorization
straightforward. There are separate contributions for soft
and collinear modes, and the only phase space constraints
that can spoil factorization are in the algorithm used to
find the jets.

For non-linear jet shapes, such as jet mass, one must
ensure that the soft and collinear contributions separate.
Typically this involves showing that the soft contributions
depend only on the total jet momentum instead of
the momentum of individual collinear particles. For instance,
for collinear-soft pairs their contribution to the
jet mass is

$$\sum_{c,s \in J} E_c E_s\, \Delta R_{ns}^2 = E_J \sum_{s \in J} E_s\, \Delta R_{ns}^2 . \tag{49}$$

This gives a contribution like $m_s^2$ in Eq. (10) in the case
of the hemisphere jet mass.

We note that factorization for jet shapes is generally
limited to some regime of the observable where soft and
collinear modes dominate the contribution to the observable.
For example, soft and collinear dynamics describe
the regime of small jet mass, $m_J \ll E_J$. In the large jet
mass regime, $m_J \sim E_J$, hard modes contribute and perturbation
theory is sufficient to accurately calculate the
mass distribution. For $N$-jettiness, the region of small
$\tau_N$ contains large logarithms of $\tau_N$ in the perturbative
series, and factorization is needed to resum these logs.
In the region of large $\tau_N$, factorization is not needed and
perturbation theory can be used.

A power counting analysis of the jet shape observable
can be useful in determining if the soft and collinear
modes of SCET$_{\rm I}$ provide the relevant description of the
final state. This addresses the first key step in the proof
of factorization outlined in Sec. II C. We carry out this
analysis for $N$-subjettiness here, which follows directly
from the study of angularities [7UJ HI] in the effective
theory [33 EU HO Hj]. For $\beta = (2 - a)$, $N$-subjettiness
has similar kinematic dependence to the angularity jet
shape $\tau_a$. We can look at the contributions of a soft and
collinear particle to $N$-subjettiness:



$$\text{collinear:} \quad \tau_N^{(\beta)} \sim \frac{p_T^c}{d_0} (\Delta R_{J,c})^\beta \sim \lambda^\beta , \qquad \text{soft:} \quad \tau_N^{(\beta)} \sim \frac{p_T^s}{d_0} (\Delta R_{J,s})^\beta \sim \lambda^2 . \tag{50}$$



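The relationships between the power counting parameter
and the jet and soft scales determine $\mu_{J,S}$ in terms of
the observable, $\tau_N^{(\beta)}$:

$$\mu_J \sim E_J \lambda \sim E_J \left(\tau_N^{(\beta)}\right)^{1/\beta} , \qquad \mu_S \sim E_J \lambda^2 \sim E_J\, \tau_N^{(\beta)} . \tag{51}$$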



For $\beta > 1$ the hard, jet, and soft scales are well separated,

$$E_J \gg E_J \left(\tau_N^{(\beta)}\right)^{1/\beta} \gg E_J\, \tau_N^{(\beta)} , \tag{52}$$

and SCET$_{\rm I}$ is a useful effective theory description of the
small $N$-subjettiness region, $\tau_N^{(\beta)} \ll 1$. When $\beta = 1$,
$\mu_J = \mu_S$ and therefore the soft and collinear modes of
SCET$_{\rm I}$ do not provide the correct description and a different
EFT is required. In this case SCET$_{\rm II}$, with soft
modes which scale as $E_J(\lambda, \lambda, \lambda)$, provides the appropriate
description of $N$-subjettiness. This corresponds to
the angularity $a = 1$, known as jet broadening [T2~M75j.
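The scale hierarchy can be illustrated numerically. The sketch below assumes the scalings $\mu_J \sim E_J\, \tau^{1/\beta}$ and $\mu_S \sim E_J\, \tau$ with all order-one coefficients set to one; the function name and the example values are hypothetical.

```python
def scet_scales(e_jet, tau, beta):
    """Hard, jet and soft scales for a given N-subjettiness value.

    Assumes mu_J ~ E_J * tau**(1/beta) and mu_S ~ E_J * tau,
    dropping all O(1) factors (illustrative only).
    Returns (mu_hard, mu_jet, mu_soft).
    """
    mu_jet = e_jet * tau ** (1.0 / beta)
    mu_soft = e_jet * tau
    return e_jet, mu_jet, mu_soft


# Example: a 500 GeV jet with tau = 0.01.
scales_beta2 = scet_scales(500.0, 0.01, 2.0)
scales_beta1 = scet_scales(500.0, 0.01, 1.0)
```

For $\beta = 2$ the three scales are separated by an order of magnitude each, while for $\beta = 1$ the jet and soft scales coincide, which is the signal that a different effective theory is needed.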

$N$-subjettiness is defined with respect to a projection
of soft and collinear momenta along subjet axes. OPS
factorization for $N$-subjettiness in SCET$_{\rm I}$ requires that

• Subjet axes obtained by minimizing $N$-subjettiness
must be independent of the soft momenta in the jet,
up to power corrections.

This says that if $N$-subjettiness OPS factorizes in SCET$_{\rm I}$
(the appropriate theory for $\beta > 1$) the minimization
over $n_J$ in Eq. (45) is determined by the sum over the
collinear particles in each subjet. The soft contribution
only changes the jet axis by a small angle, which gives
a power suppressed contribution to $\tau_N$. For $\beta = 1$ the
subjet axes are different from the sum of collinear particles
in the subjet by an $O(1)$ amount [3]. The contribution
of the soft momenta of SCET$_{\rm II}$ must be considered,
and in the case of jet broadening this is taken into account
by a recoil term [75J [75]. In [§], the authors discuss
several ways to determine the subjet axes, and it would
be interesting to determine whether the various choices
allow for factorization.

In $N$-subjettiness, each subjet axis is associated with
a subjet, made up of those particles whose contribution
to $\tau_N$ comes from that subjet. In a regime where $\tau_N$ is
small, each subjet's contribution to $\tau_N$ is also small. This
implies that the clusters of collinear particles in each subjet
are well separated, and we can treat them as distinct
collinear sectors. Therefore the soft particles can resolve
the different subjets, and we can completely determine
the constituents of each subjet.



V. BEYOND SOFT-COLLINEAR 
FACTORIZATION 

The analysis we have presented applies to a wide swath 
of jet physics applications where soft and collinear modes 
dominate observables of interest and factorization of jet 
substructure algorithms is desired. In this work we 
have discussed factorization constraints in the context 
of SCET, which provides a rigorous and powerful frame- 
work for factorization and resummation. We stress that 
the underlying principles of SCET are no different than 
QCD, and factorization in SCET simply frames factoriza- 
tion in QCD in an effective theory picture. This effective 
theory picture changes depending on what kind of ob- 
servables we are studying, and so we discuss cases where 
the OPS factorization analysis must be considered more 
carefully. 

First, we note that for jet physics applications with no 
large logarithms in the perturbative series, resummation 
may not be necessary to obtain accurate predictions$^6$. In
this case a fixed order calculation would be sufficient and 
SCET would not apply. As we have seen, this is not the 
case for most jet substructure methods. The presence 
of small parameters often guarantees large logarithms in 
the perturbative series, requiring resummation. 

When resummation is required, considerations of the 
accuracy needed in the resummed prediction are impor- 
tant. Often, a leading log (LL) or next-to-leading log 
(NLL) calculation has sufficient accuracy to compare to 
data, and other parts of the cross section (such as non- 
perturbative corrections) can have a larger error than the 
perturbative uncertainty. In these cases an all-orders, all- 
logs factorization is more powerful than is needed, and 
the constraints of OPS factorization may be relaxed. For 
example, the violations to factorization from the declus- 
tering and filtering steps may only occur at the next-to- 
next-to-leading log (NNLL) level. If NLL resummation is 
all that is desired, then NNLL violations to factorization 
are not relevant to the calculability of the substructure 
algorithm. In general, determining the order of violations 
to factorization is challenging, unlike the OPS factoriza- 
tion which is straightforward, but it would be interesting 



6 Even when logarithms are not very large, resummation can be 
helpful in reducing theoretical uncertainties and improving con- 
vergence. This is often the case when dealing with logarithms of 
scale ratios in a regime where the scales are transitioning from 
being disparate to being of the same order. 
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FIG. 10: Hierarchy of scales for two near-by jets relevant for 
jet substructure applications. 



to determine this order for declustering and filtering. We 
note that a study on the non-global logarithms in the 
MD-F algorithm suggests that violations to factorization 
do not appear at NLL [75] . 

There are many experimentally interesting jet observ- 
ables that introduce additional scales beyond the simple 
picture in Fig. |TJb) . For example, nearly all jet sub- 
structure methods introduce small parameters that will 
generate additional large logarithms in a perturbative se- 
ries. This complicates both the proof and the structure 
of the factorization theorem, and can reduce accuracy 
of predictions. It requires either an extension of SCET$_{\rm I}$
with new modes that arise due to the presence of additional
scales or an entirely different EFT. For example,
$N$-subjettiness with $\beta = 1$ is a case where SCET$_{\rm II}$ is
the appropriate effective theory instead of SCET$_{\rm I}$. As
we have seen in Sec. IV D, a power counting analysis of
the scale dependence of the soft and collinear modes can
determine which effective theory is required. In cases
where an effective theory different from SCET$_{\rm I}$ is used,
the first of the four key steps in the proof of factorization
described in Sec. II C must be carefully considered,
namely whether the soft and collinear modes of SCET$_{\rm I}$
contribute to the jet observable.

Certain jet physics applications may require a new 
effective theory description, one containing additional 
modes. These go beyond the soft and collinear modes 
in Eq. (3). There is active work on jet physics problems
of this type H2H3S1 ECU El E7J EH- While the factor- 
ization theorems for these applications are undoubtedly 
more complicated, soft-collinear factorization is still at 
the heart of these problems. The analysis we presented 
in this paper is still relevant and in fact the presence of 
additional modes makes the constraints on soft-collinear 
phase space factorization more restrictive. 

We consider an example, relevant for jet substructure,
where an extension of SCET$_{\rm I}$ is needed. Additional scales
are introduced by the observable, and to deal with them
we use SCET$_+$ [64]. This is an EFT constructed to describe
the configurations of the type depicted in Fig. [ljb)
for $m_j \ll m_J \ll p_T$, where $m_j$ is the subjet mass and $m_J$
is the invariant mass of the two nearby subjets, which
is of order of the mass of the decaying boosted object.
SCET$_+$ is an extension of SCET$_{\rm I}$ with a csoft (collinear-soft)
mode. The modes in this theory have scaling



collinear: p, 



soft: p. 



csoft: p, 




(53) 



The nearby subjets lead to a large logarithm of the pairwise
invariant mass, $\ln m_J/p_T$, in the perturbative series.
Without the csoft mode, this log cannot be resummed.
If we try to describe the two subjets with one collinear
sector then we will not be able to simultaneously sum the
large logs of the jet mass, $m_J$, and subjet mass, $m_j$. The
cross section for $pp \to N$ jets takes the schematic form

$$\sigma_N = H_+(m_J)\, H_{N-1}(p_T)\, B_a(m_j)\, B_b(m_j) \prod_{i=1}^{N} J_i(m_j) \otimes S_+(m_j^2/m_J) \otimes S_{N-1}(m_j^2/p_T) , \tag{54}$$

where the hard function $H_N$ and soft function $S_N$ of
Eq. (2) have been further factorized into two separate
functions that each depend on a single scale. Despite
the additional modes, the separation of soft and collinear
phase space constraints is still a necessary condition for
factorization of jet and soft functions for observables in
SCET$_+$.

It is important to note that this factorization requires
the collinear subjets to be well-separated, e.g., $m_j \ll m_J$,
as shown in Sec. IV. The utility of most substructure
methods also relies on this separation, since if the boost
of the heavy particle is too large, the subjets begin to
overlap and can no longer be separately reconstructed.
In the EFT this is reflected by the fact that they are no
longer described by two separate jet functions, leading to
an important factorization constraint for jet substructure
methods.

Aside from cases where a new effective theory is required,
in many jet physics applications techniques do
not exist to resum some of the large logs in the perturbative
series. This is an ongoing area of study, and SCET$_+$
is an example where a new technique to resum classes of
logarithms (in this case when two collinear sectors become
close) was developed. One of the main challenges
in this field is non-global logarithms [75]. Non-global logs
arise from measurements over a restricted region of phase
space, and are relevant for nearly every jet observable
at the LHC. These logs start at $\alpha_s^2 \ln^2$, and spoil even
NLL resummation (in the exponent). Resummation of
the leading non-global logs exists only in the large $N_c$
limit, which is analytically quite challenging [80]. Interest
in understanding these logs from the effective theory
perspective is ongoing [531 EH HE]. Beyond non-global
logs, it is currently not known how to sum logarithms of
parameters in the jet algorithm such as the jet or subjet
radius (which can be small for substructure algorithms)
or the $p_T$ cut used to veto additional jets. Similarly,
small parameters in jet substructure methods may need
additional resummation. As jet substructure matures,
improvements in these areas seem likely.

VI. CONCLUSIONS 

We have explored factorization constraints for jet sub- 
structure algorithms. Factorization plays a key role in be- 
ing able to calculate and predict jet observables, and the 
constraints from factorization provide an important the- 
oretical input for substructure. SCET gives a framework 
to explore factorization, and OPS factorization is a key 
step. OPS factorization is the physical statement that 
the phase space constraints on soft and collinear particles 
are independent at all orders in perturbation theory. It is 
a requirement of observables that factorize in SCET, and 
provides strong constraints on jet observables, especially 
more exclusive ones like jet substructure. Additionally, 
the power counting analysis that SCET provides is a use- 
ful tool to characterize the behavior of jet algorithms and 
substructure. 

After discussing the factorization of well known jet algorithms
and using power counting to study their behavior,
we examined four substructure algorithms in Sec. IV:
the mass-drop filter (MD-F) algorithm, pruning, trimming,
and $N$-subjettiness. MD-F uses two basic steps common
to nearly all substructure algorithms: declustering
and filtering. Declustering steps back through recombinations,
while filtering removes soft subjets. Both of
these methods violate factorization, but a simple modification
of the phase space cuts can specialize to a kinematic
regime where factorization is allowed. We studied
the numerical impact of these modified cuts on Higgs tagging
using a Monte Carlo study. Pruning and trimming
recluster the jet and make kinematic cuts to remove wide
angle, soft radiation. Each works in its own way, and a simple
power counting analysis reveals that factorization requires
a certain scaling for parameters in each algorithm;
this scaling is consistent with the original implementations.
We were able to use power counting to develop a
simple picture of the remaining phase space of a pruned
jet, and we found good agreement with this picture in a
Monte Carlo study. The factorization considerations for
$N$-subjettiness, and jet shapes in general, are simpler.
Jet shapes measure a specific observable useful for substructure
on all of the constituents of the jet, instead of
modifying the substructure. We discussed some of the
subtleties of OPS factorization for more complex shapes
like $N$-subjettiness.

As jet substructure methods are further explored both 
theoretically and experimentally, constraints from fac- 
torization and the power counting analysis that SCET 
provides are useful analysis tools. Theoretical calcula- 
tions of jet substructure will play an important role in 
understanding the data and the modeling of the Monte 
Carlo, and theoretically calculable substructure methods 
are needed. The future of jet substructure promises to 
give many exciting results, and a deeper understanding 
can help probe the nature of new physics at the LHC. 
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